GenCore version 5.1.9
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 



Run on:         October 27, 2006, 20:20:01 ; Search time 318 Seconds
                                             (without alignments)
                                             2713.961 Million cell updates/sec

Title:          US-10-552-515-1
Perfect score:  4950
Sequence:       1 MRMAATAWAGLQGPPLPTLC.....SELSSHWTPFTVPKASQLQQ 933



Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 
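The perfect score reported in the header is presumably what the query earns when aligned against itself, that is, the sum of the BLOSUM62 diagonal entries over its residues. A minimal Python sketch under that assumption follows; the name `self_score` is illustrative and not part of GenCore, and only the first 20 residues printed in the header are scored here.

```python
# Diagonal (identity) entries of the standard BLOSUM62 matrix.
BLOSUM62_DIAGONAL = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}


def self_score(sequence: str) -> int:
    """Score of a gap-free alignment of a sequence with itself."""
    return sum(BLOSUM62_DIAGONAL[residue] for residue in sequence.upper())


# First 20 residues of the query as printed in the header line.
query_prefix = "MRMAATAWAGLQGPPLPTLC"
print(self_score(query_prefix))
```

Applied to all 933 residues, the same sum would be expected to give the perfect score if the assumption holds.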



Searched:       2849598 seqs, 925015592 residues

Total number of hits satisfying chosen parameters:      2849598

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries
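The throughput figure in the header can be cross-checked from the query length, the number of database residues, and the search time. The sketch below assumes that one "cell update" is one query-residue by database-residue cell of the dynamic-programming matrix; the function name is illustrative.

```python
def million_cell_updates_per_sec(query_length: int,
                                 db_residues: int,
                                 seconds: float) -> float:
    """Matrix cells filled per second, in millions."""
    return query_length * db_residues / seconds / 1e6


# Values taken from the report header: 933-residue query,
# 925015592 database residues, 318 seconds of search time.
rate = million_cell_updates_per_sec(933, 925_015_592, 318)
print(f"{rate:.3f} Million cell updates/sec")
```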

Database :      UniProt_7.2:*
                1: uniprot_sprot:*
                2: uniprot_trembl:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
Result No.     Score   Query Match (%)   Length   DB   ID               Description
        39    1004.5              20.3      966    2   Q4RHQ9_TETNG     Q4rhq9 tetraodon n
        40      1001              20.2      701    2   Q8NAJ0_HUMAN


ALIGNMENTS



RESULT 1 
Q6IWH7_HUMAN
ID   Q6IWH7_HUMAN   PRELIMINARY;   PRT;   933 AA.
AC   Q6IWH7;
DT   05-JUL-2004, integrated into UniProtKB/TrEMBL.
DT   05-JUL-2004, sequence version 1.
DT   07-FEB-2006, entry version 10.
DE   NGEP long variant.
GN   Name=TMEM16G;
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC   Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RX   PubMed=14981236; DOI=10.1073/pnas.0308746101;
RA   Bera T.K., Das S., Maeda H., Beers R., Wolfgang C.D., Kumar V.,
RA   Hahn Y., Lee B., Pastan I.;
RT   "NGEP, a gene encoding a membrane protein detected only in prostate
RT   cancer and normal prostate.";
RL   Proc. Natl. Acad. Sci. U.S.A. 101:3059-3064(2004).
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; AY617079; AAT40139.1; -; mRNA.
DR   HGNC; HGNC:31677; TMEM16G.
DR   GO; GO:0005886; C:plasma membrane; IDA.
DR   InterPro; IPR007632; DUF590.
DR   InterPro; IPR002088; PPTA.
DR   Pfam; PF04547; DUF590; 1.
DR   PROSITE; PS00904; PPTA; UNKNOWN_1.
SQ   SEQUENCE   933 AA;  105531 MW;  D6FD42578A41D7D3 CRC64;

Query Match 100.0%; Score 4950; DB 2; Length 933; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 933; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy          1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db          1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60

Qy         61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db         61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120

Qy        121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180

Qy        181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240

Qy        241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300

Qy        301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360

Qy        361 AVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        361 AVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420

Qy        421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480

Qy        481 TGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        481 TGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540

Qy        541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600

Qy        601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660

Qy        661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720

Qy        721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

Qy        781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840

Qy        841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900

Qy        901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933
              |||||||||||||||||||||||||||||||||
Db        901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933





RESULT 2 
Q6IFT5_MOUSE
ID   Q6IFT5_MOUSE   PRELIMINARY;   PRT;   859 AA.
AC   Q6IFT5;
DT   05-JUL-2004, integrated into UniProtKB/TrEMBL.
DT   05-JUL-2004, sequence version 1.
DT   07-MAR-2006, entry version 10.
DE   NGEP.
GN   Name=Ngep;
OS   Mus musculus (Mouse).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;
OC   Muroidea; Muridae; Murinae; Mus.
OX   NCBI_TaxID=10090;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RX   PubMed=14981236; DOI=10.1073/pnas.0308746101;
RA   Bera T.K., Das S., Maeda H., Beers R., Wolfgang C.D., Kumar V.,
RA   Hahn Y., Lee B., Pastan I.;
RT   "NGEP, a gene encoding a membrane protein detected only in prostate
RT   cancer and normal prostate.";
RL   Proc. Natl. Acad. Sci. U.S.A. 101:3059-3064(2004).
CC   -!- MISCELLANEOUS: The sequence shown here is derived from an
CC       EMBL/GenBank/DDBJ third party annotation (TPA) entry.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; BK004075; DAA04566.1; -; mRNA.
DR   Ensembl; ENSMUSG00000034107; Mus musculus.
DR   MGI; MGI:3052714; Tmem16g.
DR   InterPro; IPR007632; DUF590.
DR   InterPro; IPR002088; PPTA.
DR   Pfam; PF04547; DUF590; 1.
DR   PROSITE; PS00904; PPTA; UNKNOWN_1.
SQ   SEQUENCE   859 AA;  97128 MW;  82E1A473C59C8DA3 CRC64;

  Query Match             76.2%;  Score 3771.5;  DB 2;  Length 859;
  Best Local Similarity   83.0%;  Pred. No. 4.1e-302;
  Matches  716;  Conservative   43;  Mismatches   95;  Indels    9;  Gaps    4;
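The two percentages printed for each hit appear to follow directly from the other reported numbers: Query Match looks like the hit score divided by the perfect score, and Best Local Similarity looks like the number of matches divided by the total number of alignment columns (matches, conservative substitutions, mismatches, and indels). The Python sketch below checks that reading against the figures for this result; the formulas are inferred from the printed values rather than taken from GenCore documentation, and the function names are illustrative.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the query's perfect score."""
    return 100.0 * score / perfect_score


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical columns as a percentage of all alignment columns."""
    aligned_columns = matches + conservative + mismatches + indels
    return 100.0 * matches / aligned_columns


# Figures reported for RESULT 2 (Q6IFT5_MOUSE); perfect score is 4950.
print(round(query_match_pct(3771.5, 4950), 1))
print(round(best_local_similarity_pct(716, 43, 95, 9), 1))
```

The same two functions can be applied to the score and column counts of any other result in the listing.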

Qy         55 MLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVL 114
              ||| :|:|||| ||||:: |||    |||||| ||| | || |  | || ||| | ||||
Db          1 MLRGQAREEDSVVLIDMASPEAGNGCSYGSTAQASEAGKQQVAPSRVGSSAKPPI-DFVL 59

Qy        115 VWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSA 174
              ||||||   | |::  :|:|| |  ||||||:||  ||| :|| ||||    ||| || |
Db         60 VWEEDL---RNQENPTKDKTDTHEVWRETFLENLCLAGLKIDQHDVQDEAAAVHYILLRA 116

Qy        175 SWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRV 234
               ||||||||||||||||||||||||||||| || ||||||:||| ||| |||||||:|:
Db        117 PWAVLCYYAEDLRLKLPLQELPNQASNWSATLLEWLGIPNILLEHVPDTPPEYYSCQFKA 176

Qy        235 NKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLH 294
              :||  |||||||||||||||||||||||||||||||||| | || ||||||| |||||||
Db        177 SKLQWFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKGLFGIDQLLAEGVFSAAFPLH 236

Qy        295 DGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYT 354
              ||||   ||  |   | |||||||||||||||||||||||||||||||||||||||||||
Db        237 DGPFSAVPESSQVLGLIQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYT 296

Qy        355 GWLLPAAVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQA 414
              |||||||||||:||||||||||||||||||| | |||:||||| || |||||||| ||||
Db        297 GWLLPAAVVGTVVFLVGCFLVFSDIPTQELCHSSDSFDMCPLCSDCSFWLLSSACTLAQA 356

Qy        415 GRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPM 474
              ||||||||||||||||||||||||||||||:|||||||||||||| ||||||||||:|||
Db        357 GRLFDHGGTVFFSLFMALWAVLLLEYWKRKNATLAYRWDCSDYEDIEERPRPQFAATAPM 416

Qy        475 TAPNPITGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGN 534
              || |||||||||||||::| |||||||||:::|||||:|||||:|||||:|||:|||| |
Db        417 TALNPITGEDEPYFPEKNRVRRMLAGSVVLLMMVAVVIMCLVSVILYRAVMAIIVSRSDN 476

Qy        535 TLLAAWASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIF 594
                |:||||||||||||||||||||||||:|| || ||||||||||||:||||||||||||
Db        477 AFLSAWASRIASLTGSVVNLVFILILSKVYVLLAQVLTRWEMHRTQTEFEDAFTLKVFIF 536

Qy        595 QFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVI 654
              ||||||:||||||||||||||||||||||||:||||| ||||| ||||||||||||||:|
Db        537 QFVNFYASPVYIAFFKGRFVGYPGNYHTLFGIRNEECPAGGCLSELAQELLVIMVGKQII 596

Qy        655 NNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQ 714
              ||:||||:||||| ||||   |: :|||   |    ||| ||||:|||||| ||||||||
Db        597 NNVQEVLVPKLKGCWQKF---SRGKKAG--TGTHPAPWEADYELLPCEGLFHEYLEMVLQ 651

Qy        715 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTH 774
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||| ||||
Db        652 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILTGLTH 711

Qy        775 LAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRD 834
              ||||||||||||||||||| || || | || ||||||||||| :| :|||||||||||||
Db        712 LAVISNAFLLAFSSDFLPRVYYSWTHAPDLHGFLNFTLARAPPTFTSAHNRTCRYRAFRD 771

Qy        835 DDGHYSQTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQAL 894
              |||||| ||| ||||||||||||||||||:||:|||||||||||||||||||||||||||
Db        772 DDGHYSPTYWTLLAIRLAFVIVFEHVVFSIGRVLDLLVPDIPESVEIKVKREYYLAKQAL 831

Qy        895 AENEVLFGTNGTKDEQPKGSELS 917
              |||| | |  | ||:||  || |
Db        832 AENEALLGATGVKDDQPPSSEPS 854



RESULT 3 
Q6IFT6_RAT
ID   Q6IFT6_RAT   PRELIMINARY;   PRT;   860 AA.
AC   Q6IFT6;
DT   05-JUL-2004, integrated into UniProtKB/TrEMBL.
DT   05-JUL-2004, sequence version 1.
DT   07-FEB-2006, entry version 8.
DE   NGEP.
GN   Name=Ngep;
OS   Rattus norvegicus (Rat).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;
OC   Muroidea; Muridae; Murinae; Rattus.
OX   NCBI_TaxID=10116;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RX   PubMed=14981236; DOI=10.1073/pnas.0308746101;
RA   Bera T.K., Das S., Maeda H., Beers R., Wolfgang C.D., Kumar V.,
RA   Hahn Y., Lee B., Pastan I.;
RT   "NGEP, a gene encoding a membrane protein detected only in prostate
RT   cancer and normal prostate.";
RL   Proc. Natl. Acad. Sci. U.S.A. 101:3059-3064(2004).
CC   -!- MISCELLANEOUS: The sequence shown here is derived from an
CC       EMBL/GenBank/DDBJ third party annotation (TPA) entry.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; BK004074; DAA04565.1; -; mRNA.
DR   Ensembl; ENSRNOG00000023427; Rattus norvegicus.
DR   InterPro; IPR007632; DUF590.
DR   InterPro; IPR002088; PPTA.
DR   Pfam; PF04547; DUF590; 1.
DR   PROSITE; PS00904; PPTA; UNKNOWN_1.
SQ   SEQUENCE   860 AA;  97170 MW;  96BE3CBD6DE96101 CRC64;

  Query Match             76.0%;  Score 3764;  DB 2;  Length 860;
  Best Local Similarity   82.7%;  Pred. No. 1.7e-301;
  Matches  714;  Conservative   47;  Mismatches   94;  Indels    8;  Gaps    4;



Qy         55 MLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVL 114
              |||::| |||| ||||:: |||    |||||| ||| | || |  | || | | | ||||
Db          1 MLRKQAGEEDSVVLIDMTSPEAGNGCSYGSTAQASEAGKQQVAPSRVGSSANPPI-DFVL 59

Qy        115 VWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSA 174
              ||||||   | :::  :|:|| |  ||||||:||| ||| :||:||||    ||| ||||
Db         60 VWEEDL---RSRENPTQDKTDTHEIWRETFLENLRVAGLKIDQRDVQDEAAAVHYILLSA 116

Qy        175 SWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRV 234
               ||||||||||||||||||||||||||||| || ||||||:||| ||| |||||||:|:
Db        117 PWAVLCYYAEDLRLKLPLQELPNQASNWSATLLEWLGIPNILLENVPDTPPEYYSCQFKA 176

Qy        235 NKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLH 294
              :||  |||||||||||||||||||||||||||||||:|| | || ||||||| |||||||
Db        177 SKLQWFLGSDNQDTFFTSTKRHQILFEILAKTPYGHQKKGLFGIDQLLAEGVFSAAFPLH 236

Qy        295 DGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYT 354
              ||||   ||  |   | ||||||:|||||||| |||||||||||||||||||||||||||
Db        237 DGPFSVVPESSQVLGLTQRQVLFKHWARWGKWRKYQPLDHVRRYFGEKVALYFAWLGFYT 296

Qy        355 GWLLPAAVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQA 414
              |||||||||||:||| ||||||||:|||||| | |:|:||||| || |||||||| ||||
Db        297 GWLLPAAVVGTVVFLAGCFLVFSDVPTQELCHSSDTFDMCPLCSDCSFWLLSSACTLAQA 356

Qy        415 GRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPM 474
              ||||||||||||||||||||||||||||||:|||||||||||||| ||||||||||:|||
Db        357 GRLFDHGGTVFFSLFMALWAVLLLEYWKRKNATLAYRWDCSDYEDIEERPRPQFAATAPM 416

Qy        475 TAPNPITGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGN 534
              || |||||||||||||::| |||||||||:::|||||:|||||||||||:|||:||:| |
Db        417 TALNPITGEDEPYFPEKNRVRRMLAGSVVLLMMVAVVIMCLVSIILYRAVMAIIVSKSNN 476

Qy        535 TLLAAWASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIF 594
                |:||||||||||||||||||||||||:|| || |||||||||||| ||||||||||||
Db        477 AFLSAWASRIASLTGSVVNLVFILILSKVYVILAQVLTRWEMHRTQTAFEDAFTLKVFIF 536

Qy        595 QFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVI 654
              ||||||:|||||||||||||||||||||||||||||| ||||| ||||||||||||||:|
Db        537 QFVNFYASPVYIAFFKGRFVGYPGNYHTLFGVRNEECPAGGCLSELAQELLVIMVGKQII 596

Qy        655 NNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQ 714
              ||:||||:||||| |||  | |:::|||   ||:  ||| ||||:|||||| ||||||||
Db        597 NNVQEVLVPKLKGCWQK--LCSRRKKAG--MGANPAPWEADYELLPCEGLFHEYLEMVLQ 652

Qy        715 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTH 774
              ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db        653 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTH 712

Qy        775 LAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRD 834
              ||||||||||||||||||| || |||| |||||||||||||| :| :|||||||||||||
Db        713 LAVISNAFLLAFSSDFLPRVYYSWTRAPDLRGFLNFTLARAPPTFTSAHNRTCRYRAFRD 772

Qy        835 DDGHYSQTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQAL 894
              |||||| ||| |||||||||||||||||| || |||||||||||||||||||||||||||
Db        773 DDGHYSPTYWTLLAIRLAFVIVFEHVVFSTGRFLDLLVPDIPESVEIKVKREYYLAKQAL 832

Qy        895 AENEVLFGTNGTKDEQPKGSELS 917
              |:|| | |  | | |||  || |
Db        833 ADNEALLGATGVKGEQPPSSEPS 855



RESULT 4 
Q8NB39_HUMAN
ID   Q8NB39_HUMAN   PRELIMINARY;   PRT;   920 AA.
AC   Q8NB39;
DT   01-OCT-2002, integrated into UniProtKB/TrEMBL.
DT   01-OCT-2002, sequence version 1.
DT   07-MAR-2006, entry version 16.
DE   CDNA FLJ34272 fis, clone FEBRA2003128.
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC   Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RC   TISSUE=Brain;
RX   PubMed=14702039; DOI=10.1038/ng1285;
RA   Ota T., Suzuki Y., Nishikawa T., Otsuki T., Sugiyama T., Irie R.,
RA   Wakamatsu A., Hayashi K., Sato H., Nagai K., Kimura K., Makita H.,
RA   Sekine M., Obayashi M., Nishi T., Shibahara T., Tanaka T., Ishii S.,
RA   Yamamoto J., Saito K., Kawai Y., Isono Y., Nakamura Y., Nagahari K.,
RA   Murakami K., Yasuda T., Iwayanagi T., Wagatsuma M., Shiratori A.,
RA   Sudo H., Hosoiri T., Kaku Y., Kodaira H., Kondo H., Sugawara M.,
RA   Takahashi M., Kanda K., Yokoi T., Furuya T., Kikkawa E., Omura Y.,
RA   Abe K., Kamihara K., Katsuta N., Sato K., Tanikawa M., Yamazaki M.,
RA   Ninomiya K., Ishibashi T., Yamashita H., Murakawa K., Fujimori K.,
RA   Tanai H., Kimata M., Watanabe M., Hiraoka S., Chiba Y., Ishida S.,
RA   Ono Y., Takiguchi S., Watanabe S., Yosida M., Hotuta T., Kusano J.,
RA   Kanehori K., Takahashi-Fujii A., Hara H., Tanase T.-O., Nomura Y.,
RA   Togiya S., Komai F., Hara R., Takeuchi K., Arita M., Imose N.,
RA   Musashino K., Yuuki H., Oshima A., Sasaki N., Aotsuka S.,
RA   Yoshikawa Y., Matsunawa H., Ichihara T., Shiohata N., Sano S.,
RA   Moriya S., Momiyama H., Satoh N., Takami S., Terashima Y., Suzuki O.,
RA   Nakagawa S., Senoh A., Mizoguchi H., Goto Y., Shimizu F., Wakebe H.,
RA   Hishigaki H., Watanabe T., Sugiyama A., Takemoto M., Kawakami B.,
RA   Yamazaki M., Watanabe K., Kumagai A., Itakura S., Fukuzumi Y.,
RA   Fujimori Y., Komiyama M., Tashiro H., Tanigami A., Fujiwara T.,
RA   Ono T., Yamada K., Fujii Y., Ozaki K., Hirao M., Ohmori Y.,
RA   Kawabata A., Hikiji T., Kobatake N., Inagaki H., Ikema Y., Okamoto S.,
RA   Okitani R., Kawakami T., Noguchi S., Itoh T., Shigeta K., Senba T.,
RA   Matsumura K., Nakajima Y., Mizuno T., Morinaga M., Sasaki M.,
RA   Togashi T., Oyama M., Hata H., Watanabe M., Komatsu T.,
RA   Mizushima-Sugano J., Satoh T., Shirai Y., Takahashi Y., Nakagawa K.,
RA   Okumura K., Nagase T., Nomura N., Kikuchi H., Masuho Y., Yamashita R.,
RA   Nakai K., Yada T., Nakamura Y., Ohara O., Isogai T., Sugano S.;
RT   "Complete sequencing and characterization of 21,243 full-length human
RT   cDNAs.";
RL   Nat. Genet. 36:40-45(2004).
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; AK091591; BAC03704.1; -; mRNA.
DR   Ensembl; ENSG00000151572; Homo sapiens.
DR   HGNC; HGNC:23837; TMEM16D.
DR   InterPro; IPR007632; DUF590.
DR   Pfam; PF04547; DUF590; 1.
SQ   SEQUENCE   920 AA;  107578 MW;  3679B542A040D5D6 CRC64;

  Query Match             30.9%;  Score 1531.5;  DB 2;  Length 920;
  Best Local Similarity   37.9%;  Pred. No. 5.7e-117;
  Matches  360;  Conservative  168;  Mismatches  316;  Indels  105;  Gaps   29;

TSSGSHCARSRMLRRRAQEEDSTVLID VSPPEAE KRGSYGST AHASEP 91 

: I I I : : : : I : I : I I III: I : I I I I 

SSSGITNGKTKVFHPVA— KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61 



Qy 


44 


Db 


4 


Qy 


92 


Db 


62 


Qy 


147 


Db 


106 


Qy 


203 


Db 


165 


Qy 


254 


Db 


224 


wy 


314 


Db 


283 


Qy 


374 


Db 


343 


Qy 


433 


Db 


402 


Qy 


492 


Db 


462 


Qy 


546 


Db 


515 


Qy 


604 


Db 


574 


Qy 


663 


Db 


634 


Qy 


721 


Db 


691 



GGQQAAACRAGS - - PAKPRIAD FVL VWEEDLKLD RQQD S AARDRTDMH RT WRET FL D 14 6 

II: : | | | | : : | | : : : : | : | | | 

GGETVPERNKSNGLYFRDGKCRI-DYILVYRK SNPQTEK REVFER 105 

NLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQE LPNQASNW 202 

I : I I I I : : : : I : : : I I I I I I I I : : : : I : I I : 

NIRAEGLQMEKESSLI-NSDI IFVKLHAPWEVLGRYAEQMNVRMPFRRKIYYLPRRYKFM 164 

S AGLLAWLGIPNVLL — EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTST 253 

I : I II : I I : I I : : |: I :: |: |::||| : 



KRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQR 313 

1:1: II: I I II : I : : : I I I I I I I I I : I :: : II 
TRSRIVHHILQRIKY-EEGKNKIGLNRLLTNGSYEAAFPLHEGSYRSKNSIRTHGAENHR 282 



: I : : II II I I I I I I I I II I I I I I : II I I | I I : I I I I III : I III I 



I II : I IN I I : I I : I I I : I I I I I : : I I I : 



II : Ihlll: I :|| II |:|: II Mill : lll:|: ill 



I I : I I : : I : :| I : :| I II I : : I 

JIFFMICWIAAVFGIVIYRWTV ST FAAFKWAL I RNN SQVA 514 



: I I : I I I :: I : : I : I :| I I I I : : : : I : : I I I I : I : I I I I I II 
T-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSST 573 

VYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLI 662 
.111111111:111 I I I I I I I I I : I :: : I I I I I II I : 
FYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI IMVLKQTWNNFMELGY 633 

PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE — GLFDEYLEMVLQFGFVTI 7 20 
I :: I I : I ::: I I I I I I I I I I I I I I I II : I I I I I I I 

PLIQNWWTR RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFTTI 690 

FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780 
I I I I I I I I I I I I I I : I I I I I I Ml : : II I : I I I : II II |: II I : I : I I : I 
FVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITN 750 






Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816 

I I :: I : I I I : I I I : : I :: I : I : 

Db 751 AFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDG 810 

Qy 817 SS FAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLL 871 

II: : I I I I : I I I : : I : : I I I I I I : I I I I I : I | : I : | 

Db 811 SEFSGT PLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFI I VFEHLVFC I KHL I SYL 870 

Qy 872 VPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920 

: I I : I : : : : : I I I I : : : I I : I : : I : I 

Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919 

RESULT 5 
Q32M45_HUMAN
ID   Q32M45_HUMAN   PRELIMINARY;   PRT;   955 AA.
AC   Q32M45;
DT   06-DEC-2005, integrated into UniProtKB/TrEMBL.
DT   06-DEC-2005, sequence version 1.
DT   07-FEB-2006, entry version 3.
DE   TMEM16D protein.
GN   Name=TMEM16D;
OS   Homo sapiens (Human).
OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC   Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC   Homo.
OX   NCBI_TaxID=9606;
RN   [1]
RP   NUCLEOTIDE SEQUENCE.
RC   TISSUE=PCR rescued clones;
RX   MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;
RA   Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA   Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA   Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA   Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA   Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA   Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA   Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA   Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA   Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA   Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA   Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA   Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA   Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA   Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA   Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA   Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA   Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT   "Generation and initial analysis of more than 15,000 full-length human
RT   and mouse cDNA sequences.";
RL   Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
RN   [2]
RP   NUCLEOTIDE SEQUENCE.
RC   TISSUE=PCR rescued clones;
RG   NIH MGC Project;
RL   Submitted (NOV-2005) to the EMBL/GenBank/DDBJ databases.
CC   -----------------------------------------------------------------------
CC   Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms
CC   Distributed under the Creative Commons Attribution-NoDerivs License
CC   -----------------------------------------------------------------------
DR   EMBL; BC109308; AAI09309.1; -; mRNA.
SQ   SEQUENCE   955 AA;  111462 MW;  9A9348C61A4F20AF CRC64;

  Query Match             30.8%;  Score 1525.5;  DB 2;  Length 955;
  Best Local Similarity   38.2%;  Pred. No. 1.9e-116;
  Matches  355;  Conservative  163;  Mismatches  309;  Indels  103;  Gaps   28;

Qy 63 EDSTVLID VSPPEAE KRGSYGST AHASEPGGQQAAACRAGS PA 105 

:| :| I III: |: II I III: ■: 

Db 56 KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEAGGETVPERNKSNGLYFRDG 115 

Qy 106 KPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET FLDNLRAAGLCVDQQDVQDGNT 165 

I I I r : : I I : : : : I : III I : I I I I : : : : I : 

Db 116 KCRI -DYILVYRK SNPQTEK REVFERN I RAEGLQMEKESSLI -NS 158 
Qy


1 cc 

loo 


TVHYALLSASWAVLCYYAEDLRLKLPLQE LPNQASNWS AGLLAWLGI PNV 

: : 1 1 1 II III : :::| : II : 1 : II : 
DI I FVKLHAPWE VLGRYALQMN VRMPF RRKIYYLPRRYKFMSRIDKQISRt RRWLPKKPM 


215 


Db 


1 CQ 

lo9 


218 


Qy 


z 1 0 


LL — EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK 
1 1 :||: : 1: 1 :: |: |::||| : 1 :|: II : 1 1 

KLUKbi LrULEENUUI i Art byyKlnnr 1-lHNKbi c b NN Al KbKl vnnl Lv?Rl KY-EEG 


272 


UD 


O 1 Q 


276 


Qy 


£ 1 J 


KNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPL 
1 1 : 1 : :: 1 1 1 1 1 1 1 1 1 : 1 : : : ' 1 1 : 1 : : 1 1 1 1 1 1 1 1 1 1 

KNKIuLNRLLI NGbYEAAr PLHEGbYRbKNbl RrHGAENHRHLLYECwASWGVWY KYQPL 


332 


Db 


z 1 1 


336 


Qy 


333 


•DHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFE 
1 I 1 1 I 1 ! 1 1 : 1 1 1 1 1 I 1 : 1 1 1 1 1 1 1 : 1 I I I I : : : I : I : | 
DL VR R Y FGE K I GL Y FAWL GWYT GM L F P AA F I G L FV F L Y GVTT L D H S QV S KE VC Q AT D 1 1 - 


392 


Db 


337 


395 


Qy 

Db 


393 
396 


MCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYR 
1 1 1 : 1 III 1 1 : 1 1 : 1 1 1 : 1 1 1 1 1 : : 1 1 1 : 1 1 : 1 I : 1 I I : I : ! I 
MCPVCDKYCP FMRLSDSCVYAKVT HL FDNGAT VFFAVFMAVWAT VFLE FWKRRRAV I AYD 


451 
455 


Qy 


452 


WDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAV 
II 1 : 1 : 1 1 1 1 1 1 1 : 1 1 1 : 1 : 1 1 1 : 1 : : : 1 . 1 : 1 
WDLIDWEEEEEEIRPQFEAKYSKKERMNPISGKPEPYQAFTDKCSRLIVSASGIFFMICV 


510 


Db 


4 56 


515 


Qy 


51 1 


WMCLVSI ILYRAIMAIWSRSGNTLLA-AWA SRIASLTGSW — NLVFILILSK 

1: : I:: I.I : :| 1 II |::|: ||: 1 1 |::|: 
VI AAVFGI VI YRWTV ST FAAFKWAL I RNN SQVAT -TGTAVC I N FC 1 1 MLLN V 


562 


Db 


51 6 


567 


Qy 


563 


I Y VS LA H V LT RW EMH RT QT K FE D A FT L KV F I FQFVN FY S S PVY I AF FKGR FVGY PGN Y HT 
:| :| :|| 1 1 1 : : : : 1 : : II 1 1 : 1 : 1 1 1 1 1 II 1(111 III I : I I 1 
LYEKVALLLTNLEQPRTE SEWENS FT LKMFLFQFVN LN S STFYIAFFLGRFTGHPGAYLR 


622 


Db 


568 


627 


Qy 


623 


L FG- VRN EEC AAGGCL I E LAQE LLV I MVGKQV I N NMQEVL I PKL KGWWQK FRLRS KKRKA 


681 


Db 


628 


1 1 1 1 1 1 1 1 1 : 1 : : : 1 1 I | I I I I : I : : I I : I : : : 
LINRWRLEECHPSGCLIDLCMQMGIIWLKQTWNNFME RKVRQEH 


684 


Qy 


682 


GASAGASQGPWEDDYELVPCE — GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVE 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 I 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 : 1 
GPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFTTIFVAAFPLAPLLALLNNIIE 


739 


Db 


685 


744 


Qy 


740 


IRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW- 
1 1 1 1 1 III : : 1 1 1 : 1 1 1 : 1 1 1 1 1 : II 1 : 1 : 1 1 : II 1 : : 1 : 1 1 1 : 1 1 1 : 
IRLDAYKFWQWRRPLASRAKDIGIWYGILEGIGILSVITNAFVIAITSDFIPRLVYAYK 


798 


Db 


74 5 


804 


Qy 


799 


TRAHDLRGFLNFTLA RAPS SFAAAHNRTCRYRAFRDD 

: 1 : : 1 : 1 : I 1 : : II 1 1 : 1 1 
YGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDGSEFSGTPLKYCRYRDYRDP 


835 


Db 


805 


864 


Qy 

Db 


836 
865 


DGH YSQTYWNLLAI RLAFVIVFEHWFSVGRLLDLLVPD I PESVE IKVKREYYLA 

1 : : 1 : : 1 1 1 1 1 1 : 1 1 1 1 1 : 1 1 : 1 : 1 : 1 1 : 1 : : : : : 1 1 1 1 
PHSLVPYGYTLQFWHVLAARLAFI IVFEHLVFCIKHLISYLI PDLPKDLRDRMRREKYLI 


890 
924 


Qy 


891 


KQALAENEVLFGTNGTKDEQPKGSELSSHW 920 
:: : 1 1 : 1 : : 1 : 1 




Db 


925 


QEMMYEAELERLQKERKERKKNGKAHHNEW 954 





RESULT 6 
TM16C_HUMAN 



ID TM16C_HUMAN STANDARD; PRT; 981 AA. 

AC Q9BYT9; 

DT 16-JAN-2004, integrated into UniProtKB/Swiss-Prot.

DT 01-JUN-2001, sequence version 1.

DT 07-FEB-2006, entry version 20.

DE Transmembrane protein 16C.

GN Name=TMEM16C; Synonyms=C11orf25;

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo.

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [MRNA].

RA Rosier M.F., Toselli E., Segurens-Soury B., Auffray C., Devignes M.D.;

RT "Predominant brain expression and full-length characterization of a

RT novel human 6.6-Kb transcript mapping at 11p14 in the telomeric part
RT of WAGR locus.";

RL Submitted (NOV-2000) to the EMBL/GenBank/DDBJ databases.

CC -!- SUBCELLULAR LOCATION: Membrane; multi-pass membrane protein

CC (Probable). 

CC -!- SIMILARITY: Belongs to the TMEM16 family. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AJ300461; CAC32454.1; -; mRNA.
DR Ensembl; ENSG00000134343; Homo sapiens.
DR HGNC; HGNC:14004; TMEM16C.
DR InterPro; IPR007632; DUF590. 
DR Pfam; PF04547; DUF590; 1. 
KW Membrane; Transmembrane. 
FT CHAIN 1 981 Transmembrane protein 16C.
FT /FTId=PRO_0000072565.
FT TRANSMEM 398 420 Potential.
FT TRANSMEM 471 490 Potential.
FT TRANSMEM 553 575 Potential.
FT TRANSMEM 590 612 Potential.
FT TRANSMEM 642 664 Potential.
FT TRANSMEM 759 781 Potential.
FT TRANSMEM 809 831 Potential.
FT TRANSMEM 904 926 Potential.
SQ SEQUENCE 981 AA; 114655 MW; 15A3276420912393 CRC64;

Query Match 30.4%; Score 1504; DB 1; Length 981;

Best Local Similarity 39.4%; Pred. No. 1.2e-114;

Matches 329; Conservative 163; Mismatches 268; Indels 76; Gaps 24;

KPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNT 165
I I I : : I ! : : I : : III I I I I I I : : : : :



Qy 


106 


Db 


161 


Qy 


166 


Db 


204 


Qy 


216 


Db 


264 


Qy 


271 


Db 


323 


Qy 


327 


Db 


376 


Qy 


387 


Db 


436 


Qy 


446 


Db 


495 


Qy 


505 


Db 


555 


Qy 


563 


Db 


613 


Qy 


623 


Db 


673 


Qy 


682 


Db 


728 


Qy 


740 



fLCYYAEDLRLKLPLQ ELPNQASNWSAGLLAWLGIPNV 215 

I I I I I I : : : I : : : : : I : : 



: I : II: : I : I :: I : : I : I I I I : : I : I : : : I : I I 



EKKNLLGIHQLLAEGVLSAAFPLHDGPFKT PPEGPQAPRLNQRQVLFQHWARWGKW 326 

I :|| :|: I Mil |:| :|: III I I :|:: Mill I 
SK VGIRKLINNGSYIAAFPPHEGAYKSSQPIKTHGPQ NNRHLLYERWARWGMW 375 

NKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCG 386 

I :! I I I : I. I I I I I : I M I I I I : I I I I : I I I : I I || I I : : :||:| 
YKHQPLDLIRLYFGEKIGLYFAWLGWYTGMLIPAAIVGLCVFFYGLFTMNNSQVSQEICK 435

SKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKS 445
: : I I I II I : I I : : I I : I M : I II M I : : I II : II : I I : I II : 



I I I I : : I : I I I I I I I I II II I : M : I : I : I 



I I : 1:1: M :| : |::|: 



:| :: I I I I I : :: : I :: I I I : I :| I I I I II Mill I I I I I : I I I 



II till IIICI :: III ||: II 



I II II : I : I I I I I I I I I I M I I II I I II I I II II I I M I I : I 

GIH-DASIPQWENDWNLQPMNLHGLMDEYLEMVLQFGFTTIFVAAFPLAPLLALLNNIIE 786

IRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW- 798
Mill III ::|||: II Mill III: IIMMIMM :||::|| I



Db 


787 


Qy 


799 


Db 


847 


Qy 


844 


Db 


907 



-TRAHDLRGFLNFTLARAP-SSFAAAHNRTCRYRAFR DDDGHYSQTY 843 

: |:|::| :|: I : MM M :: I 



|::M I I I I : : i I : : |:||:|: : :::M M :: : I I : 



RESULT 7 
Q6P5C6_MOUSE 

ID Q6P5C6_MOUSE PRELIMINARY; PRT; 956 AA. 

AC Q6P5C6; 

DT 05-JUL-2004, integrated into UniProtKB/TrEMBL.

DT 05-JUL-2004, sequence version 1.

DT 07-FEB-2006, entry version 12.

DE AU040576 protein.

GN Name=Tmem16a; Synonyms=AU040576;

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi; 

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE.

RC STRAIN=C57BL/6; TISSUE=Eye;

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J., 

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human

RT and mouse cDNA sequences.";

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).

RN [2]

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=C57BL/6; TISSUE=Eye; 

RA Strausberg R.; 

RL Submitted (NOV-2003) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; BC062959; AAH62959.1; -; mRNA. 

DR Ensembl; ENSMUSG00000031075; Mus musculus. 

DR MGI; MGI:2142149; Tmem16a.

DR GO; GO:0016021; C:integral to membrane; RCA.

DR InterPro; IPR007632; DUF590.

DR Pfam; PF04547; DUF590; 1.

SQ SEQUENCE 956 AA; 110489 MW; 150FACCDBDA4AF25 CRC64;

Query Match 30.3%; Score 1498; DB 2; Length 956;

Best Local Similarity 37.6%; Pred. No. 3.5e-114; 

Matches 360; Conservative 171; Mismatches 303; Indels 124; Gaps 28; 

Qy 26 GLYCRDQAHAERWAMT--SETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYG 83

I I I I I : : : : I I M I I I : I : I : 

Db 52 GLYFRDGKRKVDYILVYHHKRASG S RT L AR RGLQN DMVL GTRS 94 

Qy 84 STAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET 143 
II:: I I I I I :l : I III



Db 


95 


VRQDQPLPG — KGS PVDAGS PEVP MDYHEDD KRFRREE 


130 


Qy 


144 


FLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLPLQELPNQAS 

: II III:: 1 : 1 : 1 : : 1 1 1 1 1 1 1 1 : 1 1 : 1 : : : : : 
YEGNLLEAGLELE NDEDTKI HGVGFVKI HAPWHVLCREAEFLKLKMPTKKVYH ISE 


200 


Db 


131 


186 


Qy 


201 


NWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGS DNQDTFFT 

: 1 1 1 | : | | : : : 1 : 1 : 1 : 1 : : 1 : 1 1 
— TRGLLK — TINSVLQKITDPIQPKVAEHRPQTTKRLSYPFSREKQHLFDLTDRDSFFD 


251 


Db 


187 


242 


Qy 


252 


STKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLN 
1 1 1 : : 1 1 1 : 1 1 : : 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 : II ! 
SKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYSAAYPLHDGDY EGDNV-EFN 


311 


Db 


243 


296 


Qy 


312 


QRQVL FQH WARWGKWN KYQP LD HVRR YFGE KVAL Y FAWLG FYTGWL L P AAWGT LV FLVG 
I : : 1 : : II : 1 : 1 1 1 1 : 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 M II 1 : 1 1 : : 1 1 : i 1 1 1 
DRKLLYEEWASYGVFYKYQPIDLVRKYFGEKVGLYFAWLGAYTQMLIPASIVGVIVFLYG 


371 


Db 


297 


356 


Qy 


372 


CFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFM 
1 1 : 1 1 : 1 : 1 : : Mill I : 1 : 1 1 1 1 1 1 : 1 III: 1 1 1 1 1 : 1 1 
CATVDENIPSMEMCDQRYNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVFM 


430 


Db 


357 


416 


Qy 


431 


ALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAA SAPMTAPNPITGEDE 

1 1 1 1 : 1 : 1 1 1 ! | | | I I : : | : i : | | : : | I : j | I : 
ALWAATFMEHWKRKQMRLNYRWDLTGFEEEEDHPRAEYEARVLEKSLRKESRNKET — DK 


485 


Db 


417 


474 


Qy 


486 


PYFPERSRARRMLAGSWIWMVAVWMCLVS 1 1 LYRAIMAI WSRSGNTLLAAWASRIA 

II 1 1 : 1 : 1 1 : : : 1 : 1 1 1 :: : : : : 
WLTWRDRFPAYFTNLVSII FMIAVTFAIVLGVIIYRISTAAALAMNSSPSVRSNIRVTV 


545 


Db 


475 


534 


Qy 


546 


SLTGSWNLVFI LI LSKI YVSLAHVLTRWEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVY 
: 1 : : 1 II 1 : : 1 : : 1 : 1 1 .1 : 1 : : 1 : 1 1 : 1 1 1 : : 1 1 1 1 : 1 
TATAVI INLWI ILLDEVYGCIARWLTKIEVPKTEKSFEERLTFKAFLLKFVNSYTPIFY 


605 


Db 


535 


594 


Qy 

Db 


606 
595 


IAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLV IMVGKQVI -NNMQEVLI P 
: 1 1 1 1 1 1 1 1 1 1 1 : 1 : 1 1 1 1 1 1 1 1 1 1 : 1 1 : 1 : 1 1 : 1 1 1 : 1 1 1 : 1 :. II 
VAFFKGRFVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSI IMLGKQLIQNNLFEIGIP 


663 
654 


Qy 


664 


KLKGWWQKFRLRSKKRKAGASAGASQGPWEDD YELVPCEGLFDEYLEMVLQFGFVT I FVA 
1 : 1 : : : 1 1 : : : 1 1 : 1 1 II 1 1 : 1 1 : : 1 1 1 1 1 1 : 1 1 1 
KMKKFI RYLKLRRQSPSDREEYVKRKQRYEVDFNLEPFAGLT PEYMEMI I QFGFVTLFVA 


723 


Db 


655 


714 


Qy 

Db 


724 
715 


AC PLAPLFALLNNWVE I RLDARKFVCEYRRPVAERAQD IGIWFH I LAGLT HLAV I SNAFL 
: 1 1 1 1 1 1 1 1. 1 1 1 : 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 : : 1 1 1 : 1 1 1 1 III: 
SFPLAPLFALLNNIIEIRLDAKKFVTELRRPVAIRAKDIGIWYNILRGVGKLAVIINAFV 


783 
774 


Qy 


784 


LAFSSDFLPRAYYRWTRAHD— LRGFLNFTLARAPSSF AAAHN R 

: : 1 : 1 1 1 : 1 1 1 : : : : 1 1 : 1 1 1 III II : 
ISFTSDFIPRLVYLYMYSQNGTMHGFVNHTL SSFNVSDFQNGTAPNDPLDLGYEVQ 


825 


Db 


775 


830 


Qy 

Db 


826 
831 


TCRYRAFRD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVE 

■I 1 1 : : 1 : :. 1 1 : : 1 : 1 1 1 1 1 1 1 1 1 1 : : : 1 : : 1 : : 1 1 1 1 : : 
ICRYKDYREPPWSEHKYDI SKDFWAVLAARLAFVIVFQNLVMFMSDFVDWVI PD I PKD I S 


880 
890


Qy 


881 


IKVKREYYL AKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKA 928 

:: :l 1 II 1 : : : |:|: ::| :| 1 |:| 
QQIHKEKVLMVELFMREEQGKQQLLDTWM EKEKPRDVPCNNH-SPTTHPEA 940 


Db 


891 



RESULT 8 
Q5XXA6_HUMAN 



ID Q5XXA6_HUMAN PRELIMINARY; PRT; 986 AA. 

AC Q5XXA6; 

DT 23-NOV-2004, integrated into UniProtKB/TrEMBL.

DT 23-NOV-2004, sequence version 1. 

DT 07-FEB-2006, entry version 6. 

DE Tumor amplified and overexpressed sequence 2. 

GN Name=TMEM16A; 

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo.

OX NCBI_TaxID=9606;

RN [1]

RP NUCLEOTIDE SEQUENCE.
RA Huang X., Godfrey T.E., Gollin S.M.;

RT "Comprehensive Analysis of the 11q13 Amplicon in Oral Squamous Cell

RT Carcinoma Cells and Synteny to Mouse Chromosome 7F5, a Band Amplified 

RT in Murine Oral Carcinoma."; 

RL Submitted (AUG-2004) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 



CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AY728143; AAU82085.1; -; mRNA.

DR HGNC; HGNC:21625; TMEM16A.

DR InterPro; IPR007632; DUF590. 

DR Pfam; PF04547; DUF590; 1. 

SQ SEQUENCE 986 AA; 114078 MW; E30A02F91EF36FC2 CRC64 ; 



Query Match 29.9%; Score 1482; DB 2; Length 986; 

Best Local Similarity 36.8%; Pred. No. 7.7e-113; 

Matches 365; Conservative 162; Mismatches 302; Indels 164; Gaps 30; 



Qy 


26 


GLYCRDQAHAERWAMT — SETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYG 
1 1 1 1 1 : : : 1 1 : 1 1 1 1 1 1 : 1 
GLYFRDGRRKVDYILVYHHKRPSG NRTLVRRVQHSDTP SGA 


83 


Db 


52 


92 


Qy 


84 


STAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET 

: 1 : 1 : 1 1 I 1 : 1 : 1 II! 
RSVKQDHPLPGKGASLDAGSGEPP MDYHEDD KRFRREE 


143 


Db 


93 


130 


Qy 


144 


FLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLPLQELPNQAS 

: II III:: : 1 : 1 : 1 : : 1 1 III 1 1 I :| I : I ::: : 
YEGNLLEAGLELE RDEDTKIHGVGFVKIHAPWNVLCREAEFLKLKMPTKKMYH — I 


200 


Db 


131 


184 


Qy 


201 


NWSAGLLAWLGI PNVLLEWPDVPPEYYSCR FRVNKLPRFLGSDNQDTFF 

1 : 1 1 1 1 : 1 1 :: : 1 : 1 1 1 1 II :|:|| 
NETRGLLK— KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFDLSD-KDSFF 


250 


Db 


185 


241 


Qy 


251 


TSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRL 

1 1 1 : : i 1 1 : 1 1 : : 1 1 1 1 1 1 1 : 1 1 : 1 1 1 1 1 : 
DSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY NGENVEF 


310 


Db 


242 


295 


Qy 

Db 


311 
296 


N Q RQ VL FQHW ARWGKWN KYQ PL D H VR RY FG E KVA L Y FAWL G F YT GWL L P AA WGT L V FL V 
1 I::|:: : MUM ICIIIII: IIIMII II |:||::|| :IM 
NDRKLLYEEWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLI PASIVGIIVFLY 


370 
355 


Qy 


371 


GCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLF 
II : : 1 1 : 1 : ! : : 1 1 1 1 1 1 : 1 : 1 1 1 1 1 1 : 1 III: 1 1 1 1 1 :. 1 
GCATMDENIPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVF 


429 


Db 


356 


415 


Qy 

Db 


430 
416 


MALWAVLLLEYWKRKSATLAYRWDCSDYEDTEE RPRPQFAA SAPMTAPNPI 

1 1 1 II : 1 : 1 1 1 1 1 1 I 1 1 : : 1 : 1 1 1 1 : : 1 I ' : | 
MALWAATFMEHWKRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNK-, 


480 
474 


Qy 


481 


TGEDEPYFPERS RARRMLAG SWIWMVAVWM 

1:111 1 : :|| 1 |: |:| I 
— EKRRH I PEESTN KWKQRVKT AMAGVKLT DKVKLT WRDRFPAYLTNLVS 1 1 FM I AVT FA 


513 


Db 


475 


532 


Qy 


514 


CLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSVVNLVFILILSKI YVSLAHVLTR 
: : : 1 : 1 1 fl : : : : : : : 1 : : 1 1 1 1 : : 1 : : | : I | | : 
IVLGVI IYRI SMAAALAMNSSPSVRSNIRVT VTATAVI INLWI ILLDEVYGCI ARWLTK 


573 


Db 


533 


592 


Qy 


574 


WEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECA 
1 : :| : 1 1 : 1 1 : : 1 1 1 1 : 1 : i 1 1 1 1 1 1 1 1 1 1 : 1 : | Mill 
IEVPKTEKSFEERLIFKAFLLKFVNSYTPIFYVAFFKGRFVGRPGDYVYI FRSFRMEECA 


632 


Db 


593 


652 


Qy 


633 


AGGCLI ELAQELLVIMVGKQVI -NNMQEVLI PKLKGWWQKFRLRSKKRKAGASAGASQGP 

1 1 1 1 : 1 1 : 1 : 1 1 : 1 1 1 : 1 1 1 : 1 : 1 1 1 : 1 ■ : : I : : : 
PGGCLMELCIQLSI IMLGKQLIQNNLFEIGIPKMKKLIRYLKLKQQSPPDHEECVKRKQR 


691 


Db 


653 


712 


Qy 


692 


WEDDYELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEY 
: 1 II 1 1 1 1 1 1 : 1 1 :: 1 1 1 1 1 1 : 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 : 1 1 1 1 
YEVDYNLEPFAGLT PEYMEMI IQFGFVTLFVASFPLAPLFALLNNI IEIRLDAKKFVTEL 


751 


Db 


713 


772 


Qy 


752 


RR PVAERAQD I G I WFH I LAGLT H L AV I S N AFL LA FS S D FL PRA— YY RWT RAH DL RG FLN 
Mill l|:lllll::|| I: MM 1 1 1 : : : 1 : 1 1 1 : 1 1 1 ::: : MM 
RRPVAVRAKD IGIWYN ILRGIGKLAVI INAFVI S FTSDFI PRLVYLYMYS KNGTMHGFVN 


809 


Db


773 


832 


Qy 


810 


FT LARA PS SF AAAHN RTCRYRAFRD DDGH Y — SQTYWNLLA 


848 






Db 


833 


Qy 


849 


Db 


889 


Qy 


899 


Db 


947 



II III II : III: :|: : I |: :| :|| 

iTL SSFNVSDFQNGTAPNDPLDLGYEVQICRYKDYREPPWSENKYDISKDFWAVLA 888 

:RLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLA KQALAENE 898 

I I ! I M I I : : : I : :| ::||||: : :: :| | || | | 



'KDEQP KGSELSSH 919 

I I I I II II 



RESULT 9 
Q8IYY8_HUMAN 

ID Q8IYY8_HUMAN PRELIMINARY; PRT; 840 AA.

AC Q8IYY8; 

DT 01-MAR-2003, integrated into UniProtKB/TrEMBL. 

DT 01-MAR-2004, sequence version 2. 

DT 07-FEB-2006, entry version 13. 

DE TMEM16A protein. 

GN Name=TMEM16A; 

OS Homo sapiens (Human). 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo.

OX NCBI_TaxID=9606;

RN [1]

RP NUCLEOTIDE SEQUENCE.

RC TISSUE=Testis;

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G., 

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,

RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A., 

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G., 

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.; 

RT "Generation and initial analysis of more than 15,000 full-length human 

RT and mouse cDNA sequences."; 

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Testis; 

RG NIH MGC Project; 

RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; BC033036; AAH33036.2; -; mRNA. 

DR Ensembl; ENSG00000131620; Homo sapiens.

DR InterPro; IPR007632; DUF590. 

DR Pfam; PF04547; DUF590; 1. 

SQ SEQUENCE 840 AA; 97654 MW; F8503B8F813CDA27 CRC64 ; 

Query Match 29.9%; Score 1479.5; DB 2; Length 840; 

Best Local Similarity 40.0%; Pred. No. 9.9e-113; 

Matches 340; Conservative 152; Mismatches 270; Indels 89; Gaps 22; 

Qy 135 DMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLP 191 

I I II: I I III:: : I : I :| : : I I I I I I I I : I I : I 
Db 6 DDKRFRREEYEGNLLEAGLELE RDEDTKIHGVGFVKI HAPWNVLCREAEFLKLKMP 61 

Qy 192 LQELPNQASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCR FRVNKLPRFL 241 

::: : I : III I :|| :: : |: I ! ! I 

Db 62 TKKMYH — INETRGLLK — KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFD 117 
Qy 242 GSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTP 301

I I :| :| I I I I : : I I I : I I : : I I I I I I I : I I : I I I I I : 
Db 118 LSD-KDSFFDSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY 172 

Qy 302 PEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAA 361 

: I I :: I : : I I I : I : I I I I : I I I : ! I I I I : I I I I I I I I I I : I I : 
Db 173 --NGENVEFNDRKLLYEEWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLIPAS 230

Qy 362 WGT L V FL VGC FLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSS AC AL AQ AG RL FD H 420 

: I I : I I I I I : : I I : I : I : : Mill' I : I : I I I I I I :| Ml: 
Db 231 IVGI IVFLYGCATMDENI PSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDN 290 

Qy 421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAA SAPMT 475

I I I I I : I I I I I I : I : I I I I I I I I I : : I : I : M : : I I 
Db 291 PATVFFSVFMALWAATFMEHWKRKQMRLNYRWDLTGFEEEEDHPRAEYEARVLEKSLKKE 350 

Qy 476 APNPITGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNT 535

' : I I I: II I I I: 1:11 : : : I : I I | ! : : : : . 

Db 351 SRNKET — DKVKLTWRDRFPAYLTNLVSII FMIAVTFAIVLGVI IYRISMAAALAMNSSP 408 

Qy 536 LLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQ 595 

: : : I : : I I I I : : I : : I : I I |: I : : I : II: I I : : 

Db 409 SVRSNIRVTVTATAVI INLWI ILLDEVYGCIARWLTKIEVPKTEKSFEERLIFKAFLLK 468 

Qy 596 FVNFYSSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI 654 

I I I I : I : I I I I I I I I I I I : I : I I I I I I I I I I : I I : I : I I : I M : I 
Db 469 FVNSYTPIFYVAFFKGRFVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSIIMLGKQLI 528

Qy 655 -NNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVL 713 

II: I : llh! : : I : : : : I I I I I II I I : I I : : 

Db 529 QNNLFEIGIPKMKKLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMI I 588 

Qy 714 QFGFVT I FVAACPLAPLFALLNNWVE IRLDARKFVCEYRRPVAERAQD IGIWFH ILAGLT 773 

I I I I I I : I I I = I I I I I I I I I I I : I I i I I I : I I I I I I I I I I I : I I I I I :: I I I : 
Db 589 QFGFVTLFVASFPLAPLFALLNNI IEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIG 648 

Qy 774 HLAVISNAFLLAFSSDFLPRA — YYRWTRAHDLRGFLNFTLARAPSSF — AAAHN 824 

I I I I : I I : : : I : ! I I : I I I : : : : I I : I M III II 
Db 649 KLAVI I DAFVISFTSDFI PRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPN 704 

Qy 825 ; RTCRYRAFRD DDGH Y — SQTYWNLLAI RLAFVI VFEHWFSVGRLLDL 870 

: I II : : I : : I I : : I : I I I I I I I I I I : : : I : : I 
Db 705 D P L D LG YE VQ ICRYKDYREPPWSENKYDISKD FWAVLAARLA FV I V FQN L VM FM S D FVDW 764 

Qy 871 LVPD I PESVEI KVKREYYLA KQALAENEVLFGTNGTKDEQP 911 

: : I I I I : : : : : I I III I III ! 

Db 765 VIPDIPKDISQQIHKEKVLMVELFMREEQDKQQLL — ETCMEKERQKDEPPCNHHNTKAC 822 

Qy 912 KGSELSSH 919 

II II 

Db 823 PDSLGSPAPSH 833 

RESULT 10 
Q8CFW1_MOUSE

ID Q8CFW1_MOUSE PRELIMINARY; PRT; 913 AA.

AC Q8CFW1;

DT 01-MAR-2003, integrated into UniProtKB/TrEMBL.

DT 01-MAR-2003, sequence version 1.

DT 07-FEB-2006, entry version 16.

DE Transmembrane protein 16B.

GN Name=Tmem16b;

OS Mus musculus (Mouse).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Glires; Rodentia; Sciurognathi;

OC Muroidea; Muridae; Murinae; Mus. 

OX NCBI_TaxID=10090; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Eye; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;

RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,

RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,

RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,

RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,

RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,

RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,

RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,

RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,

RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W., 

RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A., 

RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,

RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,

RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C., 

RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M., 

RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,

RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;

RT "Generation and initial analysis of more than 15,000 full-length human
RT and mouse cDNA sequences.";

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002).
RN [2]

RP NUCLEOTIDE SEQUENCE.
RC TISSUE=Eye;
RA Strausberg R.;

RL Submitted (JUN-2002) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 
CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; BC033409; AAH33409.1; -; mRNA. 

DR Ensembl; ENSMUSG00000038115; Mus musculus. 

DR MGI; MGI:2387214; Tmem16b.

DR GO; GO:0016021; C:integral to membrane; IEA.
DR InterPro; IPR007632; DUF590.
DR Pfam; PF04547; DUF590; 1.
KW Transmembrane.

SQ SEQUENCE 913 AA; 104388 MW; CA17DB27D8167F64 CRC64;

Query Match 29.6%; Score 1467.5; DB 2; Length 913;

Best Local Similarity 38.1%; Pred. No. 1.1e-111;

Matches 331; Conservative 167; Mismatches 274; Indels 97; Gaps 22; 

KRGS YGSTAH— ASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARD 132 

III: : I I I I I : II : : I I 



Qy 


78 


Db 


20 


Qy 


133 


Db 


66 


Qy 


193 


Db 


121 


Qy 


241 


Db 


174 


Qy 


301 


Db 


233 


Qy 


361 


Db 


288 


Qy 


420 


Db 


348 


Qy 


469 


Db 


408 


Qy 


521 


Db 


468


Qy 


578 



I I II I I : :: I :: : : Mill I I I :: I : I 



-PNQASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLP RF 240 

: I : I I : I I I : I I I I : : I . : I 

3GSIAKKFSA-ILQTLSSP LQPRV-PEHSNNRMKNLSYPFSREKMYL 173 



lilt: I : I : I I I : I I : I I : I : I : I I : I I I I I 



:| l::J:l I I I :J : |:||:| :|:|IH|: II Mil! II :|:|: 
-MN DRKLLYQEWARYGVFYKFQP I DL I RKYFGEKI GL YFAWLGLYT S FL I P S 287 



1:1:11111 : I I I : : I : I : : : I I I I I I I : I I 1 I I I > : I I I I 



HGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEER PRPQF 468

I I I I : I I I I I I : I I I I I I I I I : I = III II:: 



-AASAPMTAPNPITGE-DEPYFPERSRARRMLAGSWIWMVAVWMCLVSIIL 520 
: I : I : I III : I I I : I : I : : : I : ' 



II I : I I I I : : I : : I I I I I I I : I I : : I I I : I : 

YRITTAAALS LNKAT RSNVRVT VT AT AV 1 1 NL W I L I LDE I YGAVAKWLT KI EVP 522 



II: I I I : : I I I I I I : I I I I I I I I I I I : I : I I I I I I I I I I 
Db



523 KTEQTFEERLILKAFLLKFVNAYSPIFWAFFKGRFVGRPGSYVYVFDGYRMEECAPGGC 582 



Qy 637 LIELAQELLVIMVGKQVI -NNMQEVLI PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDD 695 

1:11 : I : I I : I I I : I I I : I : : I I I I ::| : :: :: |: ! 

Db 583 LMELCIQLSIIMLGKQLIQNNIFEIGVPKLKKLFRKLKDETEPGESDPDHSKRPEQWDLD 642 

Qy 696 YELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPV 755 

: I I II I I : I I :: I I I I I I : I I I : I I I I : I I I I I I : I : I 1 I I: I M I III 
Db 643 HSLEPYTGLTPEYMEMIIQFGFVTLFVASFPLAPVFALLNNVIEVRLDAKKFVTELRRPD 702

Qy 756 AERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHD— LRGFLNFTLA 813 

I I : I I I I I I I I : I : : I I I I I : : I : I I I : I I I : : : : I : I I I : I I I : 
Db 703 AVRTKDIGIWFDILSGIGKFSVIINAFVIAVTSDFIPRLVYQYSYSHNGTLHGFWHTLS 762

Qy 814 RAPSSFAAAHNRTCRYRAFRD DDGHYSQTYWNLLAIRLAFVIVF 857 

: :| : II:: :|: : : I : I I : : I : I I I I I I : I 

Db 763 FFNVSQLKEGTQPENSQFDQEVQFCRFKDYREPPWAPN PYEFSKQYWSVLSARLAFVI IF 822 

Qy 858 EHWFSVGRLLDLLVPDI PESVEI KVKRE 886

: : : I : I : I : : I I II : : : I : I 
Db 823 QNLVMFLSVLVDWMIPDIPTDISDQIKKE 851 



RESULT 11 
TM16B_HUMAN



ID TM16B_HUMAN STANDARD; PRT; 999 AA. 

AC Q9NQ90; 

DT 16-JAN-2004, integrated into UniProtKB/Swiss-Prot.

DT 01-OCT-2000, sequence version 1. 

DT 07-FEB-2006, entry version 23. 

DE Transmembrane protein 16B. 

GN Name=TMEM16B; Synonyms=C12orf3;

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;

OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [MRNA].

RC TISSUE=Retina;

RA Lorenz B., White K.E., Econs M.J., Strom T.M.;

RT "Transcripts in 12p13.3.";

RL Submitted (FEB-2000) to the EMBL/GenBank/DDBJ databases. 

CC -!- SUBCELLULAR LOCATION: Membrane; multi-pass membrane protein 

CC (Probable). 

CC -!- SIMILARITY: Belongs to the TMEM16 family. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 



CC Distributed under the Creative Commons Attribution-NoDerivs License 



CC 










DR EMBL; AJ272204; CAC01125.1; -; mRNA.
DR Ensembl; ENSG00000047617; Homo sapiens.
DR HGNC; HGNC:1183; TMEM16B.
DR LinkHub; Q9NQ90;
DR InterPro; IPR007632; DUF590.
DR Pfam; PF04547; DUF590; 1.
KW Membrane; Polymorphism; Transmembrane.
FT CHAIN 1 999 Transmembrane protein 16B.
FT /FTId=PRO_0000072564.
FT TRANSMEM 360 382 Potential.
FT TRANSMEM 535 557 Potential.
FT TRANSMEM 577 599 Potential.
FT TRANSMEM 619 641 Potential.
FT TRANSMEM 746 768 Potential.
FT TRANSMEM 796 818 Potential.
FT TRANSMEM 898 920 Potential.
FT VARIANT 108 108 V -> A (in dbSNP:3741903).
FT /FTId=VAR_021932.
FT VARIANT 501 501 S -> A (in dbSNP:1860961).
FT /FTId=VAR_020331.
SQ SEQUENCE 999 AA; 113616 MW; B9B4F56161AE1B00 CRC64;



Query Match 29.6%; Score 1464; DB 1; Length 999; 

Best Local Similarity 37.4%; Pred. No. 2.4e-111;

Matches 344; Conservative 167; Mismatches 284; Indels 124; Gaps 27; 
Qy


80 


GSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRT 
1 1 II II . |::| 1: :| : 


139


Db 


125 


154 


Qy 


140 


WRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQ- 

III II III : ::!::: : : Mill II |::|:| :: : 
-REEFEHNLMEAGLEL-EKDLENKSQGSI FVRIHAPWQVLAREAEFLKIKVPTKKEMYEI 


198


Db 


155 


212 


Qy 


199


ASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLP RFLGSDNQ 


246


Db 


213 


1 :M 1 : : | | ||: : : : | : 
KAGGS I AKKFSAAL" QKLSSHLQPRV- PEHSNNKMKNLSYPFSREKMYLYN IQEK 


265 


Qy 


247 


DTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQ 
llll : 1 :|: II 1 :| I :||: |:| : l|:||||| : :| : 
DTFFDNATRSRIVHEILKRTACS-RANNTMGINSLIANNIYEAAYPLHDGEYDSPEDD-- 


306 


Db 


266 


322 


Qy 

Db 


307 
323 


APRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTL
: 1 1 : : 1 : 1 1 1 1 :| : 1 : 1 1 : 1 : 1 :| 1 1 1 1 : 1 1 1 1 1 1 1 II : 1 : I: :| : 1 : 
MNDRKLLYQEWARYGVFYKFQPIDLIRKYFGEKIGLYFAWLGLYTSFLIPSSVIGVI 


366 
379


Qy 

Db 


367 
380 


VFLVGCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVF 
1 1 1 1 1 : I 1 1 : : I : I : :: 1 1 1 1 1 I 1 : 1 1 1 1 1 1 III III: III 
VFLYGCAT I EED I P SREMCDQQNAFTMCPLCDKSCD YWNL SS ACGT AQAS HL FDN PAT VF 


425 
439 


Qy 


426 


FSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEER PRPQFAA 

1 1 : ! 1 1 1 1 1 : 1 1 1 1 1 MM: I : I I I | I : : 
FSIFMALWATMFLENWKRLQMRLGYFWDLTGIEEEEERAQEHSRPEYETKVREKMLKESN 


470 


Db 


440 


499 


Qy 


471 


-SAPMTAPNPIT GEDEPYFPERSRARRMLAGSWIWMVAVWMCLVSIILYRAIM 


525 


Db 


500 


II 1 : I 1 : 1 1 1 : 1 : 1 : : : 1 : I I 
QSAVQKLETNTTECGDEDDEDKLTWKDRFPGYLMNFASILFMIALTFSIVFGVIVYRITT 


559 


Qy 


526 


AIWSRSGNTLLAAWASRI ASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTK 

1 : 1 1 1 1 : : 1 : : 1 1 1 1 1 1 1 : 1 1 : : 1 1 1 : 1 : : I : 
AAALS LNKATRSNVRVTVTATAVIINLWILILDEIYGAVAKWLTKIEVPKTEQT 


582 


Db 


560 


614 


Qy 


583 


FEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELA 
II: 1 1 1 :: 1 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 1 1 : 1 : 1 1 ! llll 1111:11 
FEERLILKAFLLKFVNAYSPIFYVAFFKGRFVGRPGSYVYVFDGYRMEECAPGGCLMELC 


641 


Db 


615 


674 


Qy 


642 


QELLVI MVGKQVI -NNMQEVL I PKLKGWWQKFRLRS KKRKAGAS AGA- SQGP - - WEDD YE 

: 1 : 1 1 : 1 1 1 : 1 1 1 : 1 : : 1 1 1 1 : 1 1 : : 1 1. : 1 |: 1 1 : 1 1 
IQLS I IMLGKQLIQNN IFEIGVPKLK KL FRKL KD ET EAGE T D S AH S KH P E QWDL D Y S 


697 


Db 


675


731 


Qy 

Db 


698 
732 


LVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAE
1 1 II 1 1 : 1 1 :: 1 1 1 1 1 1 : 1 1 1 : 1 1 1 1 :| 1 1 1 1 1 : 1 : II 1 1 : 1 1 1 1 III 1 
LEPYTGLTPEYMEMIIQFGFVTLFVASFPLAPVFALLNNVIEVRLDAKKFVTELRRPDAV 


757 
791 


Qy 


758 


RAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHD — LRGFLNFTLA — 
1 : III I 1 1 1 1 : 1 : : 1 III 1 1 : : 1 : 1 1 1 : 1 1 I : :: : 1 : 1 1 1 : 1 II: 
RTKDIGIWFDILSGIGKFSVISNAFVIAITSDFIPRLVYQYSYSHNGTLHGFVNHTLSFF 


813 


Db 


792 


851 


Qy 


814 


RAPSSFAAAHNRTCRYRAFRD DDGHYSQTYWNLLAIRLAFVIVFEH 

: : 1 : | | :: : |: : : I : I I : I : I I I I | 1 : | : : 
NVSQLKEGTQPENSQFDQEVQFCRFKDYREPPWAPN PYEFSKQYWFILSARLAFVI IFQN 


859 


Db 


852 


911 


Qy 


860 


WFSVGRLLDLLVPDIPESVEIKVKRE YYLAKQALAENEVLFGTNGTKDEQPKG 

:l : 1:1 ::|||| : ::|:t ::| : |:| I : II 
LVMFLSVLVDWMI PDI PTDI SDQI KKEKSLLVDFFLKE EHEKLKLMDEPALRSPGG 


913 


Db 


912 


967 


Qy 


914 


SELSSHWTPFTVPKA-SQL 931 




Db 


968 


: 1 : 1 III 
GDRSRSRAAS SAPSGQSQL 986 




RESULT 12
TM16E_HUMAN







ID TM16E_HUMAN STANDARD; PRT; 913 AA. 

AC Q75V66;

DT 13-SEP-2005, integrated into UniProtKB/Swiss-Prot.

DT 05-JUL-2004, sequence version 1.

DT 07-FEB-2006, entry version 12.

DE Transmembrane protein 16E (Gnathodiaphyseal dysplasia 1 protein).

GN Name=TMEM16E; Synonyms=GDD1;

OS Homo sapiens (Human).

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;

OC Homo. 

OX NCBI_TaxID=9606; 

RN [1] 

RP NUCLEOTIDE SEQUENCE [MRNA], TISSUE SPECIFICITY, SUBCELLULAR LOCATION,
RP VARIANTS GDD GLY-356 AND ARG-356, AND CHARACTERIZATION OF VARIANTS GDD
RP GLY-356 AND ARG-356.
RC TISSUE=Skeletal muscle;
RX PubMed=15124103; DOI=10.1086/421527;
RA Tsutsumi S., Kamata N., Vokes T.J., Maruoka Y., Nakakuki K.,
RA Enomoto S., Omura K., Amagasa T., Nagayama M., Saito-Ohara F.,
RA Inazawa J., Moritani M., Yamaoka T., Inoue H., Itakura M.;

RT "The novel gene encoding a putative transmembrane protein is mutated 

RT in gnathodiaphyseal dysplasia (GDD)."; 

RL Am. J. Hum. Genet. 74:1255-1261(2004). 

RN [2] 

RP TISSUE SPECIFICITY. 

RX PubMed=15067359; 

RA Katoh M., Katoh M.;



RT "Identification and characterization of TMEM16E and TMEM16F genes in 

RT silico."; 

RL Int. J. Oncol. 24:1345-1349(2004). 

CC -!- SUBCELLULAR LOCATION: Membrane; multi-pass membrane protein
CC     (Probable). Endoplasmic reticulum. Co-localized with
CC     CALR/calreticulin.
CC -!- TISSUE SPECIFICITY: Highly expressed in brain, heart, kidney,
CC     lung, and skeletal muscle. Weakly expressed in bone marrow, fetal
CC     liver, placenta, spleen, thymus, osteoblasts and periodontal
CC     ligament cells.
CC -!- DISEASE: Defects in TMEM16E are the cause of gnathodiaphyseal
CC     dysplasia (GDD) [MIM:166260]; also called osteogenesis imperfecta
CC     with unusual skeletal lesions or gnathodiaphyseal sclerosis. GDD
CC     is a rare skeletal syndrome characterized by bone fragility,
CC     sclerosis of tubular bones, and cemento-osseous lesions of the
CC     jawbone. Patients experience frequent bone fractures caused by
CC     trivial accidents in childhood; however the fractures healed
CC     normally without bone deformity. The jaw lesions replace the
CC     tooth-bearing segments of the maxilla and mandible with fibrous
CC     connective tissues, including various amounts of cementum-like
CC     calcified mass, sometimes causing facial deformities. Patients
CC     also have a propensity for jaw infection and often suffer from
CC     purulent osteomyelitis-like symptoms, such as swelling of and pus
CC     discharge from the gums, mobility of the teeth, insufficient
CC     healing after tooth extraction and exposure of the lesions into
CC     the oral cavity.

CC -!- SIMILARITY: Belongs to the TMEM16 family. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 



CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; AB125267; BAD17859.1; -; mRNA.

DR Ensembl; ENSG00000171714; Homo sapiens.

DR HGNC; HGNC:27337; TMEM16E.

DR MIM; 166260; phenotype. 

DR MIM; 608662; gene. 

DR InterPro; IPR007632; DUF590. 

DR Pfam; PF04547; DUF590; 1. 

KW Disease mutation; Endoplasmic reticulum; Glycoprotein; Membrane; 

KW Transmembrane. 



FT   CHAIN         1    913       Transmembrane protein 16E.
FT                                /FTId=PRO_0000191755.
FT   TOPO_DOM      1    299       Cytoplasmic (Potential).
FT   TRANSMEM    300    320       Potential.
FT   TOPO_DOM    321    380       Extracellular (Potential).
FT   TRANSMEM    381    401       Potential.
FT   TOPO_DOM    402    462       Cytoplasmic (Potential).
FT   TRANSMEM    463    483       Potential.
FT   TOPO_DOM    484    511       Extracellular (Potential).
FT   TRANSMEM    512    532       Potential.
FT   TOPO_DOM    533    557       Cytoplasmic (Potential).
FT   TRANSMEM    558    578       Potential.
FT   TOPO_DOM    579    679       Extracellular (Potential).
FT   TRANSMEM    680    700       Potential.
FT   TOPO_DOM    701    732       Cytoplasmic (Potential).
FT   TRANSMEM    733    753       Potential.
FT   TOPO_DOM    754    834       Extracellular (Potential).
FT   TRANSMEM    835    855       Potential.
FT   TOPO_DOM    856    913       Cytoplasmic (Potential).
FT   CARBOHYD    335    335       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    366    366       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    380    380       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    768    768       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    778    778       N-linked (GlcNAc...) (Potential).
FT   CARBOHYD    791    791       N-linked (GlcNAc...) (Potential).
FT   VARIANT     356    356       C -> G (in GDD; decreased cell adhesion
FT                                and changed the cell morphology to a
FT                                round shape).
FT                                /FTId=VAR_023524.
FT   VARIANT     356    356       C -> R (in GDD; decreased cell adhesion
FT                                and changed the cell morphology to a
FT                                round shape).
FT                                /FTId=VAR_023525.
SQ   SEQUENCE 913 AA; 107188 MW; 98BC40318678C073 CRC64;



Query Match 29.4%; Score 1455; DB 1; Length 913;

Best Local Similarity 38.6%; Pred. No. 1.2e-110; 

Matches 325; Conservative 154; Mismatches 276; Indels 86; Gaps 22; 



Qy  108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGL CVDQQDVQDGN 164
        1 1 1 1 1 : :| :| 1 :: | : | I I I I I 1 :: 1 : 1 1
Db   78 RQIDFVLSYVDDVKKD AELKAERRKEFETNLRKTGLELEIEDKRDSEDGR 127

Qy  165 TTVHYALLSASWAVLCYYAEDLRLKLPLQE--LPNQASNWSAGLLAWLGIPNVLLEWPD 222
        1 : : : 1 1 1 1 1 1 1 1 : |: |::| : 1 : : 1 : : 1 I 1
Db  128 T--YFVKIHAPWEVLVTYAEVLGIKMPIKESDIPRPKHTPISYVLGPVRLP--LSVKYPH 183

Qy  223 VPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK-KNLLGIHQL 281
        I 1 1 : : : 1 : : II 1 1 1 1 1 1 : 1 : : 1 : : 1 1 : : 1 : 1 1 1 1 1 : 1
Db  184 --PEYFTAQFSRHRQELFLIED-QATFFPSSSRNRIVYYILSRCPFGIEDGKKRFGIERL 240

Qy  282 LAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGE 341
        1 1 : 1 : 1 1 1 ! 1 : III 1 : 1 1 1 : 1 1 1 : : 1 1 1 1 1 : : 1 : 1 1
Db  241 LNSNTYSSAYPLHDGQYWKPSEPPNP--TNERYTLHQNWARFSYFYKEQPLDLIKNYYGE 298

Qy  342 KVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSK--DSFEMCPLCLD 399
Db  299 KIGIYFVFLGFYTEMLFFAAWGLACFIYGLLSMEHNTSSTEICDPEIGGQMIMCPLCDQ 358

Qy  400 -CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYE 458
        1 : 1 1 : 1 1 : : III: 1 1 1 1 :: 1 1 : 1 1 1 1 : 1 1 : : 1 1 ! 1 1 1 : 1
Db  359 VCDYWRLNSTCLASKFSHLFDNESTVFFAIFMGIWVTLFLEFWKQRQARLEYEWDLVDFE 418

Qy  459 DTEE--RPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLV 516
        : : : : 1 1 : 1 1 1 :| 1 III 1 : 1 : : : :: II : I
Db  419 EEQQQLQLRPEFEAMCKHRKLNAVTKEMEPYMPLYTRIPWYFLSGATVTLWMSLVVTSMV 478

Qy  517 SIILYRAIMAIWSRSGNTLLAAWASRI ASLTGSWNLVFIL 558
        :: 1 : II :: | : | | : 1 1 1 ! 1 : 1 : 1 1
Db  479 AVIVYRL SVFATFASFMESDASLKQVKSFLTPQITTSLTGSCLNFIVIL 527

Qy  559 ILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPG 618
        II: 1 : : : 1 : |: 1 1 : : 1 : 1 1 1 : 1 ■: 1 1 I 1 1 1 1 1 1 1 : 1 1 1 1 1 : 1 1 1 1 1 1
Db  528 ILNFFYEKISAWITKMEIPRTYQEYESSLTLKMFLFQFVNFYSSCFYVAFFKGKFVGYPG 587

Qy  619 NYHTLFGV-RNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSK 677
        1 II 1 : 1 1 1 1 I 1 1 1 1 1 : 1 : i 1 1 1 1 : 1 : : 1 : 1 II :
Db  588 KYTYLFNEWRSEECDPGGCLIELTTQLT I IMTGKQI FGNI KEAI YPLALNWW R 640

Qy  678 KRKAGASAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLN 735
        : i 1 1 : : 1 1 1 :: 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 : 1 1 1 1 1 1 1 : I
Db  641 RRKARTNSEKLYSRWEQDHDLESFGPLGLFYEYLETVTQFGFVTLFVASFPLAPLLALIN 700

Qy  736 NWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAY 795
        11111:111 : 1 1 1 1 1 : 1 1 1 : 1 III: 1 : 1 : 1 1 1 : : 1 1 : 1 1 : 1 1
Db  701 NIVEIRVDAWKLTTQYRRTVASKAHSIGVWQDILYGMAVLSVATNAFIVAFTSDIIPRLV 760

Qy  796 YRW TRAHDLRGFLN FTLARAPSSFAAAHNR TCRYRAFR DDDGHY- 839
        1 : : 1 : : 1 1 : 1 .1 : 1 : 1 1 1 1 1 1 : 1 II : 1
Db  761 YYYAYSTNATQPMTGYVNNSLSVFLIADFPNHTAPSEKRDFITCRYRDYRYPPDDENKYF 820

Qy  840 -SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI PESVEIKVKREYYLAKQALAENE 898
Db  821 HNMQFWHVLAAKMTFI IVMEHVVFLVKFLLAWMIPDVPKDWERIKREKLMTIKILHDFE 880

Qy  899 V 899
Db  881 L 881



RESULT 13 
Q6DDQ3_XENLA 

ID Q6DDQ3_XENLA PRELIMINARY; PRT; 896 AA. 

AC Q6DDQ3; 

DT 16-AUG-2004, integrated into UniProtKB/TrEMBL . 

DT 16-AUG-2004, sequence version 1. 

DT 07-FEB-2006, entry version 12. 

DE Tmem16e-prov protein.

GN Name=tmem16e-prov;

OS Xenopus laevis (African clawed frog) . 

OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; 

OC Amphibia; Batrachia; Anura; Mesobatrachia ; Pipoidea; Pipidae; 

OC Xenopodinae; Xenopus; Xenopus. 

OX NCBI_TaxID=8355; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Spleen; 

RX MEDLINE=22341132; PubMed=12454917; DOI=10.1002/dvdy.10174;

RA Klein S.L., Strausberg R.L., Wagner L., Pontius J., Clifton S.W.,

RA Richardson P.; 

RT "Genetic and genomic tools for Xenopus research: The NIH Xenopus 

RT initiative."; 

RL Dev. Dyn. 225:384-391(2002). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Spleen; 

RX MEDLINE=22388257; PubMed=12477932; DOI=10.1073/pnas.242603899;
RA Strausberg R.L., Feingold E.A., Grouse L.H., Derge J.G.,
RA Klausner R.D., Collins F.S., Wagner L., Shenmen C.M., Schuler G.D.,
RA Altschul S.F., Zeeberg B., Buetow K.H., Schaefer C.F., Bhat N.K.,
RA Hopkins R.F., Jordan H., Moore T., Max S.I., Wang J., Hsieh F.,
RA Diatchenko L., Marusina K., Farmer A.A., Rubin G.M., Hong L.,
RA Stapleton M., Soares M.B., Bonaldo M.F., Casavant T.L., Scheetz T.E.,
RA Brownstein M.J., Usdin T.B., Toshiyuki S., Carninci P., Prange C.,
RA Raha S.S., Loquellano N.A., Peters G.J., Abramson R.D., Mullahy S.J.,
RA Bosak S.A., McEwan P.J., McKernan K.J., Malek J.A., Gunaratne P.H.,
RA Richards S., Worley K.C., Hale S., Garcia A.M., Gay L.J., Hulyk S.W.,
RA Villalon D.K., Muzny D.M., Sodergren E.J., Lu X., Gibbs R.A.,
RA Fahey J., Helton E., Ketteman M., Madan A., Rodrigues S., Sanchez A.,
RA Whiting M., Madan A., Young A.C., Shevchenko Y., Bouffard G.G.,
RA Blakesley R.W., Touchman J.W., Green E.D., Dickson M.C.,
RA Rodriguez A.C., Grimwood J., Schmutz J., Myers R.M.,
RA Butterfield Y.S.N., Krzywinski M.I., Skalska U., Smailus D.E.,
RA Schnerch A., Schein J.E., Jones S.J.M., Marra M.A.;
RT "Generation and initial analysis of more than 15,000 full-length human
RT and mouse cDNA sequences.";

RL Proc. Natl. Acad. Sci. U.S.A. 99:16899-16903(2002). 

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RC TISSUE=Spleen; 

RA Klein S., Gerhard D.S.; 

RL Submitted (JUL-2004) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; BC077486; AAH77486.1; -; mRNA. 

DR InterPro; IPR007632; DUF590. 

DR Pfam; PF04547; DUF590; 1. 

SQ SEQUENCE 896 AA; 105167 MW; EBFCF828411C8C50 CRC64;

Query Match 29.3%; Score 1449; DB 2; Length 896;

Best Local Similarity 37.1%; Pred. No. 3.6e-110; 

Matches 334; Conservative 167; Mismatches 297; Indels 102; Gaps 27; 

Qy   59 RAQEEDSTVLIDVSPPEAEKRGSY GSTAHASEPGGQQAAACRAGSPAKPRIADFVLV 115
        I : I I I I : : I I I : I ■ II : : I I
Db    7 RREEE TLIEMSVTGDESNGALLDNNSITDSELPGNSEI DKHVQSKDSVFF 56

Qy  116 WEEDLKLD RQQDSAARDRTDMHRTWRET FLDNLRAAGLCVDQQDVQDG-N 164
        I: ::| :: : I I I I I I : I I : : : I : I I
Db   57 WDGIRRIDFILSYTDETNKEAEKKAERRRD FEFNLHKSGLELETEDKKDSEN 108

Qy  165 TTVHYALLSASWAVLCYYAEDLRLKLPLQ--ELPNQASNWSAGLLAWLGIPNVLLEVVPD 222
        :: : I I II III I :|:||: :| ::: I : :| :l h
Db  109 GKTYFLKI HAPWEVLTTYAEVLN I KMPLKADDLTDESENLLS HMLKPFKLP PE 161

Qy  223 V PPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK-KNLLGI 278
        I I : I : : I I : I I I : : I I I : I : : I : : I I : : -I I I : I II
Db  162 VMSPEPDYETAPFRKDKQELFRIED-KEKFFTPSTRNRIVYYILSRCHYGEEEGKKKFGI 220

Qy  279 HQLLAEGVLSAAFPLHDGPF--KTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVR 336
        :M I I = I I I I : II I :l |: III:: :: :l
Db  221 KRLLNNGSYLDAYPLHDCRYWKKTD ERSCERYTLYSHWAKFTRFYKEQPLDLIR 274

Qy  337 RYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELC--GSKDSFEMC 394
        : I : II I : : I I I I I I I I I I I I I I I 111 : I : : I : I I II
Db  275 KYYGEKIGIYFAWLGFYTEMLFYAAWGFFCFLYGWITMDDSISSKEICDPGIGGQIIMC 334

Qy  395 PLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWD 453
        III I I I I : I I : I : I I : I : I I : : I I : I I I I : I I I : I I I I I
Db  335 PLCDKRCEFWRLNSTCEPSQYSHMFDNVATLFFAIFMGIWVTLFLEFWKRRQARLEYEWD 394

Qy  454 CSDYEDTEE--RPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAW 511
        |:|: :: : ||:: I I I : I I I I I I I : I I :: ::::
Db  395 LVDFEEEQQQLQLRPEYEAKCTDKKKNPVTQEMEPYLPPSSKAVRFCFSGATVLFWISLI 454

Qy  512 VMCLVSIILYRA-IMAI WSRSGNTLLAAWASRI ASLTGSWNLVFILILSKIY 564
        : : : : I I : 1 I : I I II : I : I I I : I : I : I I : : I
Db  455 IASIIAIIVYRLWVYAAFASIMENNLTLEPVRNLLTPQLATSVTASVLNFITIMILNFLY 514

Qy  565 VSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLF 624
        : I : I I : I I :: I : I : I : I : I I I I I : I I I I : I I I I I : I I I I I : I II
Db  515 ERVAIWITDMEIPRTHLEYENRLTMKMFLFQFVNYYSSCFYVAFFKGKFVGYPADYTYLF 574

Qy  625 GV-RNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGA 683
        I Mil! I I I I I I : I : : I I I I : I : I I : I I I : I I I
Db  575 GKWRNEECDPAGCLIELTTQLTIVMAGKQIWGNIQEAFVPWTWNW LKRRKARN 627

Qy  684 SAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIR 741
        II I : I III I I I I II : I I I I : I : I I I : I I I I I I I I I I : I I I
Db  628 HPENLYSRWEQDGDLQTFGGLGLFYEYLEMWQFGFITLFVASFPLAPLLALLNNILEIR 687

Qy  742 LDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRA--YYRWT 799
        : I : I :: : | I I |. : I I I : I I I I : I : I : : I I I : : I I : I I : I I I I : I
Db  688 VDSWKLTTQFKRPVAAKAHSIGVWQEILNGIAILSVVTNAFIVAFTSDMIPRLVYYYAYT 747

Qy  800 RAHD — LRGFLNFTLARAPSSFAAAHNR TCRYRAFRDDDGH YSQ 841
        : I : I : : : : I I I : : I I I t : I I I I
Db  748 QDKDMPMSGYISSSL SIFNVTDFKEQSMPTKNDMNVLSCRYRDYRYPPGHGKEYQV 803

Qy  842 T--YWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEV 899
        I I I : : I I :: I I : I : II I I I I : I : I I I I I : : I I I I : I : : I I I :
Db  804 TMQYWHILAAKMAFIIIMEHWFLVKFFVAWLIPDIPSEVKARVKREKFLTQKILHEYEL 863

RESULT 14 
Q9VTS0_DROME 

ID Q9VTS0_DROME PRELIMINARY; PRT; 1219 AA.

AC Q9VTS0; 

DT 01-MAY-2000, integrated into UniProtKB/TrEMBL . 

DT 01-MAY-2000, sequence version 1. 

DT 07-FEB-2006, entry version 25. 

DE CG6938-PA. 

GN ORFNames=CG6938, Dmel_CG6938; 

OS Drosophila melanogaster (Fruit fly) . 

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota; 

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha ; 

OC Ephydroidea; Drosophilidae; Drosophila. 

OX NCBI_TaxID=7227; 

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=20196006; PubMed=10731132; DOI=10.1126/science.287.5461.2185;
RA Adams M.D., Celniker S.E., Holt R.A., Evans C.A., Gocayne J.D.,
RA Amanatides P.G., Scherer S.E., Li P.W., Hoskins R.A., Galle R.F.,
RA George R.A., Lewis S.E., Richards S., Ashburner M., Henderson S.N.,
RA Sutton G.G., Wortman J.R., Yandell M.D., Zhang Q., Chen L.X.,
RA Brandon R.C., Rogers Y.-H.C., Blazej R.G., Champe M., Pfeiffer B.D.,
RA Wan K.H., Doyle C., Baxter E.G., Helt G., Nelson C.R., Miklos G.L.G.,
RA Abril J.F., Agbayani A., An H.-J., Andrews-Pfannkoch C., Baldwin D.,
RA Ballew R.M., Basu A., Baxendale J., Bayraktaroglu L., Beasley E.M.,
RA Beeson K.Y., Benos P.V., Berman B.P., Bhandari D., Bolshakov S.,
RA Borkova D., Botchan M.R., Bouck J., Brokstein P., Brottier P.,
RA Burtis K.C., Busam D.A., Butler H., Cadieu E., Center A., Chandra I.,
RA Cherry J.M., Cawley S., Dahlke C., Davenport L.B., Davies P.,
RA de Pablos B., Delcher A., Deng Z., Mays A.D., Dew I., Dietz S.M.,
RA Dodson K., Doup L.E., Downes M., Dugan-Rocha S., Dunkov B.C., Dunn P.,
RA Durbin K.J., Evangelista C.C., Ferraz C., Ferriera S., Fleischmann W.,
RA Fosler C., Gabrielian A.E., Garg N.S., Gelbart W.M., Glasser K.,
RA Glodek A., Gong F., Gorrell J.H., Gu Z., Guan P., Harris M.,
RA Harris N.L., Harvey D.A., Heiman T.J., Hernandez J.R., Houck J.,
RA Hostin D., Houston K.A., Howland J.J., Wei M.-H., Ibegwam C.,
RA Jalali M., Kalush F., Karpen G.H., Ke Z., Kennison J.A., Ketchum K.A.,
RA Kimmel B.E., Kodira C.D., Kraft C., Kravitz S., Kulp D., Lai Z.,
RA Lasko P., Lei Y., Levitsky A.A., Li J.H., Li Z., Liang Y., Lin X.,
RA Liu X., Mattei B., McIntosh T.C., McLeod M.P., McPherson D.,
RA Merkulov G., Milshina N.V., Mobarry C., Morris J., Moshrefi A.,
RA Mount S.M., Moy M., Murphy B., Murphy L., Muzny D.M., Nelson D.L.,
RA Nelson D.R., Nelson K.A., Nixon K., Nusskern D.R., Pacleb J.M.,
RA Palazzolo M., Pittman G.S., Pan S., Pollard J., Puri V., Reese M.G.,
RA Reinert K., Remington K., Saunders R.D.C., Scheeler F., Shen H.,
RA Shue B.C., Siden-Kiamos I., Simpson M., Skupski M.P., Smith T.,
RA Spier E., Spradling A.C., Stapleton M., Strong R., Sun E.,
RA Svirskas R., Tector C., Turner R., Venter E., Wang A.H., Wang X.,
RA Wang Z.-Y., Wassarman D.A., Weinstock G.M., Weissenbach J.,
RA Williams S.M., Woodage T., Worley K.C., Wu D., Yang S., Yao Q.A.,
RA Ye J., Yeh R.-F., Zaveri J.S., Zhan M., Zhang G., Zhao Q., Zheng L.,
RA Zheng X.H., Zhong F.N., Zhong W., Zhou X., Zhu S., Zhu X., Smith H.O.,
RA Gibbs R.A., Myers E.W., Rubin G.M., Venter J.C.;

RT "The genome sequence of Drosophila melanogaster."; 

RL Science 287:2185-2195(2000). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RX MEDLINE=22426065; PubMed=12537568;
RA Celniker S.E., Wheeler D.A., Kronmiller B., Carlson J.W., Halpern A.,
RA Patel S., Adams M., Champe M., Dugan S.P., Frise E., Hodgson A.,
RA George R.A., Hoskins R.A., Laverty T., Muzny D.M., Nelson C.R.,
RA Pacleb J.M., Park S., Pfeiffer B.D., Richards S., Sodergren E.J.,
RA Svirskas R., Tabor P.E., Wan K., Stapleton M., Sutton G.G., Venter C.,
RA Weinstock G., Scherer S.E., Myers E.W., Gibbs R.A., Rubin G.M.;
RT "Finishing a whole-genome shotgun: release 3 of the Drosophila
RT melanogaster euchromatic genome sequence.";
RL Genome Biol. 3:RESEARCH0079-RESEARCH0079(2002).
RN [3]
RP NUCLEOTIDE SEQUENCE.
RX MEDLINE=22426070; PubMed=12537573;
RA Kaminker J.S., Bergman C.M., Kronmiller B., Carlson J.W., Svirskas R.,
RA Patel S., Frise E., Wheeler D.A., Lewis S.E., Rubin G.M.,
RA Ashburner M., Celniker S.E.;
RT "The transposable elements of the Drosophila melanogaster euchromatin:
RT a genomics perspective.";
RL Genome Biol. 3:RESEARCH0084.1-RESEARCH0084.20(2002).
RN [4]
RP NUCLEOTIDE SEQUENCE.
RX MEDLINE=22426069; PubMed=12537572;
RA Misra S., Crosby M.A., Mungall C.J., Matthews B.B., Campbell K.S.,
RA Hradecky P., Huang Y., Kaminker J.S., Millburn G.H., Prochnik S.E.,
RA Smith C.D., Tupy J.L., Whitfield E.J., Bayraktaroglu L., Berman B.P.,
RA Bettencourt B.R., Celniker S.E., de Grey A.D.N.J., Drysdale R.A.,
RA Harris N.L., Richter J., Russo S., Schroeder A.J., Shu S.Q.,
RA Stapleton M., Yamada C., Ashburner M., Gelbart W.M., Rubin G.M.,
RA Lewis S.E.;
RT "Annotation of the Drosophila melanogaster euchromatic genome: a
RT systematic review.";
RL Genome Biol. 3:RESEARCH0083.1-RESEARCH0083.22(2002).
RN [5]
RP NUCLEOTIDE SEQUENCE.
RG Berkeley Drosophila Genome Project;
RA Celniker S., Carlson J., Wan K., Pfeiffer B., Frise E., George R.,
RA Hoskins R., Stapleton M., Pacleb J., Park S., Svirskas R., Smith E.,
RA Yu C., Rubin G.;
RT "Drosophila melanogaster release 4 sequence.";
RL Submitted (MAR-2000) to the EMBL/GenBank/DDBJ databases.
RN [6]
RP NUCLEOTIDE SEQUENCE.
RG FlyBase;

RL Submitted (JAN-2006) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License



CC 

DR EMBL; AE003543; AAF49976.1; -; Genomic_DNA.

DR FlyBase; FBgn0036235; CG6938. 

DR InterPro; IPR007632; DUF590. 

DR InterPro; IPR003006; Ig_MHC. 

DR InterPro; IPR001563; Peptidase_S10. 

DR Pfam; PF04547; DUF590; 1. 

DR PROSITE; PS00560; CARBOXYPEPT_SER_HIS; UNKNOWN_1.

DR PROSITE; PS00290; IG_MHC; UNKNOWN_1.

SQ SEQUENCE 1219 AA; 139592 MW; F5ABEB2726ED82A6 CRC64; 



Query Match 29.2%; Score 1445; DB 2; Length 1219; 

Best Local Similarity 35.6%; Pred. No. 1.2e-109; 

Matches 342; Conservative 165; Mismatches 332; Indels 122; Gaps 27; 



Qy   35 AERWAMTSETSSGSHCARSRML RRRAQEEDSTVL IDVS PPEAEKRGSY 82
        1:1 : 1 111:1 |:| : |:| Ml
Db  249 ADRVNQSYEVMESSH SNVLPDQFGYRQLI PTERKASDTASSV SGSY 294

Qy   83 GSTAHASEP GGQQAAACRAGSPAKP RIADFVLVW-EEDLKLDRQ 125
        : II : ' M : 1 : 1 1 II 1 Mil :. :
Db  295 YGSRKASKSNSLGGESGDERRVSKQDREGLDPESLMFRDGRRKVDMVLAWEEEDLGVMTE 354

Qy  126 QDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQD-VQDGNTTVHYALLSASWAVLCYYAE 184
        :: II 1 :|::|| II |: :| 1 1 : : 1 : II
Db  355 AEAKRRDN-------RRSFMENLIKEGLEVELEDKSQSFNEKTFFLKIHLPWRLETRLAE 407

Qy  185 DLRLKLP LQELPNQASNWSAGLLAWLGIPNVLLEWPDVPP 225
        : 1 1 1 1 1: : 1 1 : : 1 1 1
Db  408 VMNLKL PVKRFI T I SVKP SWDEEN WLRNMQYWKDVWQR-LT KKI QLDQT LLE GET 462

Qy  226 EYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEG 285
        : : 1 : 1 : 1 : 1 1 1 1 : 1 :: ::| : I I : :: 1 1 : 1 : : 1
Db  463 TFKAATANGNPEEQFIVKD-RATAFTSAQRSLMVMQVLIRTPFDESDRS--GIRRLMNDG 519

Qy  286 VLSAAFPLHDGPFKTPPEGPQAPRLN-QRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVA 344
        1111:1 : : 1 : : : 1:11:1 II : 1 1 III 11:111:1:1
Db  520 TYLGCFPLHEGRY----DRPHSSGISLDRRVLYQTWAHPSQWYKKQPLCLVRKYFGDKIA 575

Qy  345 LYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSD--IPTQELCG--SKDSFEMCPLC-LD 399
        1 1 1 1 1 1 1 1 1 1 : 1111111:1 : 1 : 1 :: 1 : 1 : Mill
Db  576 LYFCWLGFYTEMLVYPAWGTLCFIYGLATLESEDNTPSKEICNEYGTGNITLCPLCDKA 635

Qy  400 CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYED 459
        1 : 1 1 : 1 :: III: 1 1 1 1 ::| 1 : 1 1 1 1 M 1 1 : 1 : 1 1 : 1
Db  636 CSYQRLSESCLFSRLTYLFDNPSTVFFAIFMSFWATTFLELWKRKQSVLWEWDLHNV-D 694

Qy  460 TEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVVVMCLVSII 519
        : 1 1 1 : 1 : 1 1 1 : 1 1 1 1 1 : 1 : 1 : : : 1 : : 1 1 : : : 1
Db  695 MDEENRPE FETNATT FRMN PVT REKE PYMSTWNRS I RFVITGSAVL FM I S WLSAVLGT I 754

Qy  520 LYRAIMAIWSRSGNTLLAAWASRIASLTGSVWLVFILILSKIYVSLAHVLTRWEMHRT 579
        III : 1 : 1 : 1 1 : 1 : : : 1 1 1 1 : 1 1 : : 1 1 : 1 II 1 II
Db  755 LYRITLVSVIYGGGGFFVKEHAKLFTSVTAALINLWIMILTRIYHRMAIKLTNLENPRT 814

Qy  580 QTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHT----LFGVRNEECAAGG 635
        1 : : 1 1 : : 1 1 : 1 1 : 1 : 1 1 1 1 1 : II 1 1 1 1 1 1 1 Ml: 1 : : 1 : 1 1
Db  815 HTEYEDSYTFKIFFFEFMNFYSSLIYIAFFKGRFFDYPGDDQARKSEFFRLKNDICDPAG 874

Qy  636 CLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDD 695
        1 1 1 1 : 1 : 1 1 II II M 1 1 1 1 II: : 1 : : 1 III
Db  875 CLSELCIQLAIIMVGKQCWNNFMEYLFPKFWNWWR----QRKHKQATKDESHLHMAWEQD 930

Qy  696 YELV-PCE-GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRR 753
        1 : 1 1 M II 1 1 1 : 1 1 : M 1 1 : 1 1 M 1 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 : 1 1
Db  931 YHMQDPGRLALFDEYLEMILQYGFVTLFVAAFPLAPLFALLNNVAEIRLDAYKMVTQARR 990






Qy  754 PVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYR--WTRAHDLRGFLNFT 811
        1:111 Mil I: II :|: I I : I I I I : : I : : I I I : I I |: :: III:: :
Db  991 PLAERVEDIGAWYGILRI ITYTAWSNAFVIAYTSDFI PRMVYKFVYSETHTLAGYIEHS 1050

Qy  812 LA RAPSSFAAAHNRTCRYRAFRDDDGHY SQTYWNLLAIRLAFVIVF 857
        I: :l : :|: I I I I : : I ' I I I I I : I I
Db 1051 LSIFNTSDYKEEWGASVSEKDPDTCQYRGYRNGPKDYEPYGLSPHYWHVFAARLAFVWF 1110

Qy  858 EHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELS 917
        Mill : :: ::||:| |: :::|| llhl : :| I I ::
Db 1111 EHWFVITGIMQFIIPDVPSEVKTQMQREQLLAKEAKYQ HGI KRAQGDSQDIM 1163

Qy  918 S 918
        I
Db 1164 S 1164



RESULT 15 
Q2M0Y5_DROPS 

ID Q2M0Y5_DROPS PRELIMINARY; PRT; 1235 AA. 

AC Q2M0Y5; 

DT 21-FEB-2006, integrated into UniProtKB/TrEMBL. 

DT 21-FEB-2006, sequence version 1. 

DT 21-FEB-2006, entry version 1. 

DE GA19969-PA (Fragment). 

GN Name=Dpse\GA19969; ORFNames=Dpse_GA19969;

OS Drosophila pseudoobscura (Fruit fly).

OC Eukaryota; Metazoa; Arthropoda; Hexapoda; Insecta; Pterygota;

OC Neoptera; Endopterygota; Diptera; Brachycera; Muscomorpha;

OC Ephydroidea; Drosophilidae; Drosophila.

OX NCBI_TaxID=7237;

RN [1] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=MV2-25;
RX PubMed=15632085; DOI=10.1101/gr.3059305;
RA Richards S., Liu Y., Bettencourt B.R., Hradecky P., Letovsky S.,
RA Nielsen R., Thornton K., Hubisz M.J., Chen R., Meisel R.P.,
RA Couronne O., Hua S., Smith M.A., Zhang P., Liu J., Bussemaker H.J.,
RA van Batenburg M.F., Howells S.L., Scherer S.E., Sodergren E.,
RA Matthews B.B., Crosby M.A., Schroeder A.J., Ortiz-Barrientos D.,
RA Rives C.M., Metzker M.L., Muzny D.M., Scott G., Steffen D.,
RA Wheeler D.A., Worley K.C., Havlak P., Durbin K.J., Egan A., Gill R.,
RA Hume J., Morgan M.B., Miner G., Hamilton C., Huang Y., Waldron L.,
RA Verduzco D., Clerc-Blankenburg K.P., Dubchak I., Noor M.A.F.,
RA Anderson W., White K.P., Clark A.G., Schaeffer S.W., Gelbart W.,
RA Weinstock G.M., Gibbs R.A.;

RT "Comparative genome sequencing of Drosophila pseudoobscura: 

RT chromosomal, gene, and cis-element evolution."; 

RL Genome Res. 15:1-18(2005). 

RN [2] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=MV2-25; 

RG FlyBase; 

RL Submitted (JUN-2005) to the EMBL/GenBank/DDBJ databases. 

RN [3] 

RP NUCLEOTIDE SEQUENCE. 

RC STRAIN=MV2-25; 

RG Human Genome Sequencing Center; 

RA Richards S., Liu Y., Bettencourt B.R., Hradesky P., Letovsky S.,
RA Chen R., Smith M.A., Howells S.L., Scherer S.E., Sodergren E.,
RA Rives C.M., Metzker M.L., Munzy D.M., Wheeler D.A., Worley K.C.,
RA Havlak P., Durbin K.J., Egan A., Gill R., Hume J., Morgan M.B.,
RA Huang Y., Waldron L., Verduzco D., Blankenburg K.P., Adams C.,
RA Allen C., Allen H., Anyalebechi V., Asomugha C., Bellard T.,
RA Bhuchar V., Biswalo K., Blair J., Blomstrom D., Burrell K.,
RA Calderon E., Cardenas V., Carter K., Cavazos I., Ceasar H., Chacko J.,
RA Chavez D., Chu J., Cockrell R., Cox C., Coyle M., Davila M., Davis C.,
RA Davy-Carroll L., De A., Delgado O., Denson S., Deramo C., Dinh H.,
RA Eaves K., Escotto M., Eugene C., Falls T., Fernandez S., Flagg N.,
RA Forbes L., Garner T., Garza M., Ghose S., Grady M., Hamilton C.,
RA Hernandez J., Hines S., Hogues M., Hollins B., Idlebird D., Imo K.,
RA Jimenez A., Johnson B., Jolivet A., Kelly S., King L., Kisamo H.,
RA Kovar C., Lebow H., Lee K., LeGall F., Lewis L., Li Z., London P.,
RA Lopez J., Lozado R., Malloy K., Martinez E., Mercadao C., Miner G.,
RA Minja E., Moore S., Nanavati A., Ngo R., Nguyen N., Nwaokelemeh O.,
RA Okwuonu G., Parks K., Pasternak S., Patel B., Paul H., Payne C.,
RA Poindexter A., Primus E., Pu L.-L., Puazo M., Quiroz J., Rabata D.,
RA Reigh R., Ruiz S., Sanders W., Sisson I., Sorelle R., Taylor C.,
RA Taylor T., Thomas N., Trejos Z., Usmani K., Vera V., Villasana D.,
RA Wang S., Warren J., Warren R., White F., Wleczyk R., Wright R.,
RA Noor M.A.F., Schaeffer S.W., Gelbart W., Weinstock G.M., Gibbs R.A.,
RA Weinstock G., Gibbs R.;

RL Submitted (JUN-2005) to the EMBL/GenBank/DDBJ databases. 

CC 

CC Copyrighted by the UniProt Consortium, see http://www.uniprot.org/terms 

CC Distributed under the Creative Commons Attribution-NoDerivs License 

CC 

DR EMBL; CH379069; EAL30791.1; -; Genomic_DNA. 

FT NON_TER 1235 1235

SQ SEQUENCE 1235 AA; 141719 MW; 755BBB05AFEE8D11 CRC64; 



Query Match 29.2%; Score 1443; DB 2; Length 1235; 

Best Local Similarity 35.6%; Pred. No. 1.8e-109; 

Matches 350; Conservative 169; Mismatches 327; Indels 136; Gaps 32; 



Qy   17 PTLC PAVRTGLYCRDQAHA ERWAM-TSETSSGS HCARS 53
        1 1 1 1 ! : : 1 : III 1: : III:
Db  256 PTLCGHRTSSEDGVNQSYEIMESSHSNVLPDQFGYRQLLPTERKASDTASSISGSYYG-- 313

Qy   54 RMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFV 113
        1:1 : :| : 1 J:| 1 1 :: II ill
Db  314 SRKASKSNSI MGDSENERRIS KQDREGLDPESLMFRDGR RKVDMV 358

Qy  114 LVW-EEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQD-VQDGNTTVHYAL 171
        1 i llll : : :: II 1 :|::|| II |: :| I I :
Db  359 LAWEEEDLGVMTEAEARRRD LRRS FMENLVKEGLEVELEDKSQS FNEKTFFLK 411

Qy  172 LSASWAVLCYYAEDLRLKLPLQEL--PNQASNWSAGLLAWLGIPNVLLEWPDVPPEYY- 228
        : 1 : 1 1 ■: 1 1 1 1 : : : : 1 ■ 1 1 : I I : I :
Db  412 IHLPWRLETRLAEVMNLKLPIKRFITISVKPSWDE ENWLRNV QYWR 458

Qy  229 SCRFRV NKLPRFLGSDNQDTFFTSTKRHQILFEILA 264
        1 : 1 : 1 : 1 : 1 1 1 1 : 1 :: :: I
Db  459 EVWQRLTKKIQLDQSLLEGETT FKAATANGNPEEQFIVKD-RATAFTSAQRSLMVMQVLI 517

Qy  265 KTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLN-QRQVLFQHWARW 323
        : 1 1 1 : : 1 1 : 1 : : 1 1 1 1 1 : 1 : : 1 : : : 1 : 1 1 : 1 II
Db  518 RTPYDETDRS--GIRRLMNDGTYLGCFPLHEGRY----DRPHSSGISLDRRVLYQTWAHP 571

Qy  324 GKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSD--IPT 381
        : 1 1 1 1 1 1 1 : 1 1 1 : 1 : 1 1 1 1 1 1 1 1 1 1 1 : : 1 1 1 1 1 I: I : I : 1 :
Db  572 SQWYKKQPLCLVRKYFGDKIALYFCWLGFYTEMLVYPSWGTLCFIYGLATLESEDNTPS 631

Qy  382 QELCG--SKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLL 438
        : 1 : 1 : Mill ! : 1 1 : 1 : : III: 1 1 1 1 :: 1 1 : 1 1 1
Db  632 KEICNEFGTGNITLCPLCDKACSYQRLSESCLFSRLTYLFDNPSTVFFAIFMSFWATTFL 691

Qy  439 EYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRML 498
        1 Mil : 1 : II : 1 :| 11:1 :| Ihl 1 III :|| 1 :
Db  692 ELWKRKQSVLVWEWDLHNV-DMDEENRPEFETNATTFRMNPVTREKEPYMSTWNRAIRFV 750

Qy  499 AGSWIWMVAVVVMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSVVNLVFIL 558
        : : 1 : : 1 1 : : : Mil : 1 : 1 : 1 1 : 1 : : : I I I I :
Db  751 VTGSAVLFMISWLSAVLGTILYRITLVSVIYGGGGFFVKEHAKLFTSVTAALINLWIM 810

Qy  559 ILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPG 618
        1 1 :: 1 1 : 1 II 1 II 1 : : 1 1 : : 1 1 : 1 1 : 1 :| 1 1 1 1 : 1 1 1 1 1 1 1 1 1 II!
Db  811 ILTRIYHRMAIRLTNLENPRTHTEYEDSYTFKIFFFEFMNFYSSLIYIAFFKGRFFDYPG 870

Qy  619 N YHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRL 674
        : 1 :: 1 : 1 Mill : 1 : 1 1 1 1 1 1 II llll I I :
Db  871 DEGARRSEFFRLKNDICDPAGCLSELCIQLAI IMVGKQCWNNFMEYLFPKFWNWWR 926

Qy  675 RSKKRKAGASAGASQGPWEDDYELV-PCE-GLFDEYLEMVLQFGFVTIFVAACPLAPLFA 732
        : 1 :: 1 UN: I 1 1 1 1 1 1 1 1 : 1 1 : 1 1 I 1 : 1 1 1 1 1 1 1 1 1 1 1
Db  927 QRKHKQATKDESHLHMAWEQDYHMQDPGRLALFDEYLEMILQYGFVTLFVAAFPLAPLFA 986

Qy  733 LLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLP 792
        llll llllll 1 1 : 111:111 MM 1: II :|: 1 1 : 1 1 1 1 : : 1 : : 1 1 1 : 1
Db  987 LLNNVAEIRLDAYKMVTQARRPLAERVEDIGAWYGILRIITYTAVVSNAFVIAYTSDFIP 1046

Qy  793 RAYYR — WTRAHDLRGFLNFTLARAPSS FAAAHNR TCRYRAFRDDDGHY — 839
        I I: :: III:: :|: :| |: I I I : I I- : I : I '
Db 1047 RMVYKFVYSETHTLAGYIEHSLSIFNTSDYKEEWGASVSERDPDTCQYRGYRNGPKDYDA 1106

Qy  840 SQTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAE 896
        I I I : : I I I I I I : I I I I I II : : : : : I I : I I : : : : I I III:! :
Db 1107 YGLSPHYWHVFAARLAFVWFEHWFVITGIMQFIIPDVPSEVKTQMQREQLLAKEAKYQ 1166

Qy  897 NEVLFGTNGTKDEQPKGSELSS 918
        :| I I :: I
Db 1167 HGIKRAQGDSQDIMS 1181

Search completed: October 27, 2006, 20:28:46 
Job time : 327 secs



SCORE 1.3 BuildDate: 12/06/2005 
This page gives you Search Results detail for the Application 10552515 and Search Result us-10-552-515-1.rag.




GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model 

Run on: October 27, 2006, 20:18:31 ; Search time 205 Seconds 

(without alignments) 
2080.892 Million cell updates/sec 



Title:          US-10-552-515-1
Perfect score:  4950
Sequence:       1 MRMAATAWAGLQGPPLPTLC...SELSSHWTPFTVPKASQLQQ 933



Scoring table:  BLOSUM62
                Gapop 10.0 , Gapext 0.5



Searched:       2589679 seqs, 457216429 residues

Total number of hits satisfying chosen parameters:      2589679

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database :      A_Geneseq_8:*
                1:  geneseqp1980s:*
                2:  geneseqp1990s:*
                3:  geneseqp2000s:*
                4:  geneseqp2001s:*
                5:  geneseqp2002s:*
                6:  geneseqp2003as:*
                7:  geneseqp2003bs:*
                8:  geneseqp2004s:*
                9:  geneseqp2005s:*
                10: geneseqp2006s:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed,
and is derived by analysis of the total score distribution. 
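
The throughput figure in the header can be cross-checked from the other header values. The sketch below is a plausibility check only; it assumes that one "cell update" means one (query residue, database residue) cell of the dynamic-programming matrix, which the report itself does not state.

```python
# Values copied from the search header above.
query_length = 933           # residues in US-10-552-515-1
db_residues = 457_216_429    # residues searched in A_Geneseq_8
search_seconds = 205         # "Search time" (without alignments)

# Assumption: one cell update = one (query residue, database residue) pair.
cell_updates = query_length * db_residues
rate_millions = cell_updates / search_seconds / 1e6

print(f"{rate_millions:.3f} Million cell updates/sec")
```

Under that assumption the result agrees with the 2080.892 Million cell updates/sec printed in the header.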



SUMMARIES 



                  %
Result          Query
   No.   Score  Match Length  DB  ID        Description

     1    4950  100.0    933   8  ADT77664  Adt77664 Splice va
     2  4531.5   91.5    885   9  AEB13426  Aeb13426 Human pro
     3  4364.5   88.2    843   9  AEB13424  Aeb13424 Human pro
     4    3736   75.5    898   4  ABG15488  Abg15488 Novel hum
     5  1531.5   30.9    920   7  ADB64420  Adb64420 Human pro
     6  1511.5   30.5    920   6  ABP58666  Abp58666 Human dih
     7    1504   30.4    981   8  ADK52114  Adk52114 Human ato
     8    1488   30.1    960  10  AEG11142  Aeg11142 Human tra
     9     [?]   29.9    840  10  AEG11146  Aeg11146 Human tra
    10    1464   29.6   1003 [?]  ADG48280  Adg48280 Human ret
    11     [?]    [?]    [?]   4  ABB62812  Abb62812 Drosophil
    12     [?]   28.3    [?]   7  ADC42854  Adc42854 REMAP pro
    13  1369.5   27.7   1075   4  ABB65993  Abb65993 Drosophil
    14  1367.5   27.6    712  10  AEG11145  Aeg11145 Human tra
    15  1199.5   24.2    [?]   4  ABB65022  Abb65022 Drosophil
    16    1154   23.3    [?]   7  ADB64387  Adb64387 Human pro
    17  1061.5   21.4    594   4  AAB92637  Aab92637 Human pro
    18  1061.5   21.4    594   5  ABP43811  Abp43811 FLJ10261
    19  1061.5   21.4    594   8  ADJ75429  Adj75429 Marker ge
    20  1061.5   21.4    594   8  ADN04848  Adn04848 Antipsori
    21  1061.5   21.4    594  10  AEG11143  Aeg11143 Human FLJ
    22  1037.5   21.0    782   7  ADT95905  Adt95905 Colon can
    23  1037.5   21.0    782   7  ADX42387  Adx42387 Human col
    24  1037.5   21.0    782   8  ADQ96288  Adq96288 T cell ac
    25  1037.5   21.0    782   8  ADQ96104  Adq96104 T cell ac
    26   912.5   18.4    475   7  ADB64962  Adb64962 Human pro
    27     905   18.3    642   7  ADM05798  Adm05798 Human pro
    28     905   18.3    642   9  AEC88728  Aec88728 Human cDN
    29     905   18.3    642  10  AEG11144  Aeg11144 Human FLJ
    30   819.5   16.6    443   5  ABP41785  Abp41785 Human ova
    31   817.5   16.5    179   7  AAO29613  Aao29613 Human Nov
    32   784.5   15.8    [?] [?]  ABB90382  Abb90382 Human pol
    33     735   14.8    139   5  AAE24066  Aae24066 Human pro
    34   722.5   14.6    360   4  AAM40391  Aam40391 Human pol
    35   711.5   14.4    346   8  ADP29628  Adp29628 Human sec
    36   695.5   14.1    608   8  ADQ96298  Adq96298 T cell ac
    37   695.5   14.1    608   8  ADQ96286  Adq96286 T cell ac
    38   684.5   13.8    483   7  ADM05305  Adm05305 Human pro
    39   684.5   13.8    483   8  ADQ96290  Adq96290 T cell ac
    40   684.5   13.8    483   9  AEC88235  Aec88235 Human cDN
    41   656.5   13.3    314   4  AAM42177  Aam42177 Human pol
    42     612   12.4    339   4  AAB94837  Aab94837 Human pro
    43     601   12.1    594   8  ADQ67527  Adq67527 Novel hum
    44   594.5   12.0    589   4  AAB92752  Aab92752 Human pro
    45   594.5   12.0    589   5  ABB97370  Abb97370 Novel hum



ALIGNMENTS 

RESULT 1 
ADT77664 

ID ADT77664 standard; protein; 933 AA. 
XX 

AC ADT77664; 
XX 

DT 13-JAN-2005 (first entry) 
XX 

DE Splice variant-novel gene expressed in prostate (SV-NGEP) polypeptide. 
XX 

KW Splice variant-novel gene expressed in prostate; SV-NGEP; human; 

KW prostate cancer; cytostatic; gene therapy; immunotherapy. 

XX 

OS Homo sapiens. 



XX 

FH   Key             Location/Qualifiers
FT   Domain          1..345
FT                   /label= Cytoplasmic
FT   Region          157..933
FT                   /note= "An immunogenic fragment comprising 8 consecutive
FT                   amino acids that specifically binds to an antibody that
FT                   specifically binds to a polypeptide comprising amino
FT                   acids 157-933 is referred to in Claim 1"
FT   Region          170..178
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Region          215..223
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Region          258..266
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Domain          346..368
FT                   /label= Transmembrane
FT   Domain          369..421
FT                   /label= External
FT                   /note= "Cell surface"
FT   Region          403..411
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Domain          422..441
FT                   /label= Transmembrane
FT   Region          427..435
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Domain          442..501
FT                   /label= Cytoplasmic
FT   Domain          502..524
FT                   /label= Transmembrane
FT   Domain          525..543
FT                   /label= External
FT                   /note= "Cell surface"
FT   Domain          544..566
FT                   /label= Transmembrane
FT   Region          557..565
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Region          562..570
FT                   /note= "Epitope, predicted to bind HLA2-01"
FT   Domain          567..586
FT                   /label= Cytoplasmic
FT   Domain          587..609
FT                   /label= Transmembrane
FT   Domain          610..714
FT                   /label= External
FT                   /note= "Cell surface"
FT   Domain          715..737
FT                   /label= Transmembrane
FT   Domain          738..761
FT                   /label= Cytoplasmic
FT   Domain          762..784
FT                   /label= Transmembrane
FT   Domain          785..933
FT                   /label= External
FT                   /note= "Cell surface"
FT   Region          846..854
FT                   /note= "Epitope, predicted to bind HLA2-01"
XX 
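
The feature table above pairs each key with a residue range and optional /label or /note qualifiers. The following is a minimal Python sketch for reading such lines into records. It assumes the simple key-plus-range layout seen in this entry and tolerates stray spaces and punctuation of the kind OCR introduces; `parse_features` is a hypothetical helper, not part of GenCore or any Geneseq tooling, and it keeps only the first line of a multi-line note.

```python
import re

def parse_features(lines):
    """Return one dict per feature: key, start, end, plus any label/note."""
    features = []
    for line in lines:
        if not line.startswith("FT"):
            continue  # skip FH header, XX separators, other line codes
        # Drop the line code and stray punctuation around the payload.
        body = line[2:].strip(" .'")
        loc = re.match(r"(Domain|Region)\W+(\d[\d .]*)$", body)
        if loc:
            # "427. .4 35" -> "427..435" -> (427, 435)
            digits = loc.group(2).replace(" ", "")
            start, end = (int(x) for x in re.split(r"\.+", digits) if x)
            features.append({"key": loc.group(1), "start": start, "end": end})
            continue
        qual = re.match(r"\W*/(label|note)\W+(.*)$", body)
        if qual and features:
            # Attach the qualifier to the most recent feature.
            features[-1][qual.group(1)] = qual.group(2).strip(' "')
    return features

sample = [
    "FT Domain 346. .368",
    "FT /label - Transmembrane",
    "FT Region 427. .4 35",
    'FT /note= "Epitope, predicted to bind HLA2-01"',
]
for feature in parse_features(sample):
    print(feature)
```

Continuation lines of long notes match neither pattern and are ignored, which is a deliberate simplification for this sketch.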

PN WO2004092213-A1. 
XX 

PD 28-OCT-2004. 
XX 

PF 05-APR-2004; 2004WO-US010588.
XX

PR 08-APR-2003; 2003US-0461399P.
XX

PA (USSH ) US DEPT HEALTH & HUMAN SERVICES.

XX

PI Pastan I, Bera TK, Lee B; 

XX 

DR WPI; 2004-758338/74. 

DR N-PSDB; ADT77665. 
XX 

PT New Splice Variant-Novel Gene Expressed in Prostate polypeptide or 

PT encoding nucleic acid molecule for diagnosing, preventing or treating 

PT cancer, especially prostate cancer. 
XX 

PS Claim 1; SEQ ID NO 1; 88pp; English. 
XX 

CC The present sequence is the protein sequence of splice variant-novel gene 

CC expressed in prostate (SV-NGEP). SV-NGEP is identical to NGEP from amino

CC acid 1-157, diverging from amino acid 158. Expression analysis in 76 

CC normal and foetal tissues showed SV-NGEP to be strongly expressed only in 

CC a prostate sample. Claimed methods for detecting prostate cancer in a 

CC subject comprise: contacting the sample with an antibody that 

CC specifically binds a SV-NGEP polypeptide and detecting the formation of 

CC an immune complex; or detecting an increase in expression of SV-NGEP 

CC polypeptide or mRNA. Antibodies to an SV-NGEP polypeptide can be used to 

CC detect metastatic prostate cancer cells at locations other than the 

CC prostate. A claimed method for producing an immune response against a 

CC cell expressing SV-NGEP, for example in a subject with prostate cancer, 

CC comprises administering the polypeptide, or a polynucleotide encoding it, 

CC to produce an immune response that decreases growth of the prostate 

CC cancer. A claimed method for inhibiting the growth of a malignant cell 

CC that expresses SV-NGEP comprises culturing cytotoxic T lymphocytes (CTLs) 
CC with SV-NGEP to produce activated CTLs that recognise an NGEP expressing

CC cell, and contacting the malignant cell with the activated CTLs. 

CC Alternatively, growth of a malignant cell is inhibited by contact with an 

CC antibody that specifically binds an SV-NGEP polypeptide, where the 

CC antibody is linked to an effector molecule (chemotherapeutic agent or

CC toxin) that inhibits growth of the malignant cell. This may be performed 

CC in vivo. Kits for detecting an SV-NGEP polypeptide or polynucleotide in a 

CC sample are also claimed. 

XX 

SQ Sequence 933 AA; 



Query Match 100.0%; Score 4950; DB 8; Length 933; 

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 933; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 



Qy     1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db     1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60

Qy    61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120

Qy   121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180

Qy   181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240

Qy   241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300

Qy   301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360

Qy   361 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   361 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420

Qy   421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480

Qy   481 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   481 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540

Qy   541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600

Qy   601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660

Qy   661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720

Qy   721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

Qy   781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840

Qy   841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900

Qy   901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933
         |||||||||||||||||||||||||||||||||
Db   901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933



RESULT 2 
AEB13426 

ID AEB13426 standard; protein; 885 AA. 
XX 

AC AEB13426; 
XX 

DT 22-SEP-2005 (first entry) 
XX 

DE Human prostate specific polypeptide #2. 
XX 

KW Screening; diagnosis; drug delivery; prostate specific polypeptide; 

KW cancer; prostate tumor; cytostatic; neoplasm. 

XX 

OS Homo sapiens. 
XX 

PN WO2005062788-A2. 
XX 

PD 14-JUL-2005. 
XX 

PF 16-DEC-2004; 2004WO-US042406.
XX

PR 22-DEC-2003; 2003US-0531809P.
XX 

PA (AVAL-) AVALON PHARM INC. 
XX 

PI Weigle B, Ebner R; 
XX 

DR WPI; 2005-497793/50. 

DR N-PSDB; AEB13425. 
XX 

PT Novel isolated prostate specific polypeptide, useful for treating cancer, 

PT and identifying agent that modulates activity of cancer related gene. 
XX 

PS Claim 12; SEQ ID NO 5; 59pp; English. 
XX 

CC The invention relates to an isolated prostate specific polypeptide 

CC comprising one or more immunogenic fragments. The invention also relates 

CC to a method of identifying an agent that modulates the activity of a 

CC cancer related gene involving contacting a compound with a cell 

CC containing a gene under conditions promoting the expression of the gene, 

CC detecting a difference in expression of the gene relative to when the 

CC compound is not present and identifying an agent that modulates the 

CC activity of a cancer related gene, a method of identifying an anti- 

CC neoplastic agent involving contacting a cell exhibiting neoplastic 

CC activity with a compound first identified as a cancer related gene 

CC modulator using and determining a decrease in neoplastic activity after 

CC contacting, when compared to when the contacting does not occur, or 

CC administering an agent first identified to an animal exhibiting a cancer 

CC condition and detecting a decrease in cancerous condition, a method of 

CC determining the cancerous status of a cell involving determining an 

CC increase in the level of expression in a cell of a gene where an elevated 

CC expression relative to a known non-cancerous cell indicates a cancerous 

CC state or potentially cancerous state, an antibody that reacts with a 

CC prostate specific polypeptide, an immunoconjugate comprising the antibody

CC and a cytotoxic agent, a method of treating cancer involving contacting a 

CC cancerous cell in vivo with an agent having activity against a prostate 

CC specific polypeptide and an immunogenic composition the prostate specific 

CC polypeptide. The prostate specific polypeptide is useful for identifying 

CC an agent that modulates the activity of a cancer related gene. The 

CC immunogenic composition is useful for treating cancer, preferably 

CC prostate cancer in an animal, e.g. human, which involves administering 

CC the immunogenic composition that is sufficient to elicit the production 

CC of cytotoxic T lymphocytes specific for the prostate specific 

CC polypeptide. The invention is useful for identifying anti-neoplastic 

CC agents. This sequence represents a human prostate specific polypeptide of 

CC the invention. 

XX 

SQ Sequence 885 AA; 

Query Match 91.5%; Score 4531.5; DB 9; Length 885; 
Best Local Similarity 99.7%; Pred. No. 0; 

Matches 855; Conservative 0; Mismatches 0; Indels 3; Gaps 2; 
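
The two percentages in these statistics lines appear to follow directly from the other figures. The sketch below states the relations as assumptions inferred from the numbers in this report, not as documented GenCore definitions: Query Match is taken as the score divided by the query's perfect score, and Best Local Similarity as identical columns divided by all alignment columns. The function names are illustrative.

```python
def query_match(score, perfect_score):
    """Score as a percentage of the query's perfect (self) score."""
    return 100.0 * score / perfect_score

def best_local_similarity(matches, conservative, mismatches, indels):
    """Identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns

# Result 2 (AEB13426): Score 4531.5, Perfect score 4950,
# Matches 855, Conservative 0, Mismatches 0, Indels 3.
print(f"Query Match {query_match(4531.5, 4950):.1f}%")
print(f"Best Local Similarity {best_local_similarity(855, 0, 0, 3):.1f}%")
```

The same relations reproduce the percentages printed for results 3 and 4 in this report (88.2% and 99.6%; 75.5% and 82.3%).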
Qy     1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60
         |||||||||||||||||||||||||||||||||||||||||||||||||||  |||||||
Db     5 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCA--RMLRRRA 62

Qy    61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120
         ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||
Db    63 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRI-DFVLVWEEDL 121

Qy   121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   122 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 181

Qy   181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   182 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 241

Qy   241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   242 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 301

Qy   301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   302 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 361

Qy   361 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   362 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 421

Qy   421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   422 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 481

Qy   481 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 541

Qy   541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   542 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 601

Qy   601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   602 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 661

Qy   661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   662 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 721

Qy   721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   722 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 781

Qy   781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   782 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 841

Qy   841 QTYWNLLAIRLAFVIVFE 858
         ||||||||||||||||||
Db   842 QTYWNLLAIRLAFVIVFE 859
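
The score of 4531.5 for this result can be reconciled with the perfect score of 4950 using only the scoring parameters in the header. The sketch below is a consistency check under stated assumptions: it uses the standard BLOSUM62 diagonal values, assumes a gap of k residues costs Gapop + k × Gapext, and reads query residues 860-861 as "VV" (as in the UniProt alignment of query residues 840-896 earlier in this report) where other copies of the query print "W". The variable names are illustrative.

```python
# BLOSUM62 diagonal (self-substitution) scores for the 20 standard residues.
BLOSUM62_DIAG = {
    "A": 4, "R": 5, "N": 6, "D": 6, "C": 9, "Q": 5, "E": 5, "G": 6,
    "H": 8, "I": 4, "L": 4, "K": 5, "M": 5, "F": 6, "P": 7, "S": 4,
    "T": 5, "W": 11, "Y": 7, "V": 4,
}

def self_score(seq):
    """Score of a segment aligned to itself under BLOSUM62."""
    return sum(BLOSUM62_DIAG[residue] for residue in seq)

def gap_cost(length, gap_open=10.0, gap_ext=0.5):
    # Assumption: a gap of `length` residues costs gap_open + length * gap_ext.
    return gap_open + length * gap_ext

PERFECT_SCORE = 4950  # from the search header

# Query residues 859-933, which this alignment does not cover.
# Assumption: positions 860-861 are read as "VV".
unaligned_tail = ("HVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL"
                  "FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ")

# Query residues that face the two gaps in Db: "RS" (52-53) and "A" (110).
gapped_segments = ["RS", "A"]

score = PERFECT_SCORE - self_score(unaligned_tail)
for segment in gapped_segments:
    # Each gap loses the residues' self-score and pays the gap penalty.
    score -= self_score(segment) + gap_cost(len(segment))

print(score)
```

With these assumptions the computed value equals the reported score, which also supports the reading of the gap penalty as Gapop plus Gapext per gapped residue.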

RESULT 3 
AEB13424

ID AEB13424 standard; protein; 843 AA. 
XX 

AC AEB13424; 
XX 

DT 22-SEP-2005 (first entry) 
XX 

DE Human prostate specific polypeptide #1. 
XX 

KW Screening; diagnosis; drug delivery; prostate specific polypeptide; 

KW cancer; prostate tumor; cytostatic; neoplasm. 

XX 

OS Homo sapiens. 
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XX 

PN WO2005062788-A2. 
XX 

PD 14-JUL-2005. 
XX 

PF 16-DEC-2004; 2004WO-US042406.
XX

PR 22-DEC-2003; 2003US-0531809P.
XX 

PA (AVAL-) AVALON PHARM INC. 
XX 

PI Weigle B, Ebner R; 
XX 

DR WPI; 2005-497793/50. 

DR N-PSDB; AEB13423. 
XX 

PT Novel isolated prostate specific polypeptide, useful for treating cancer, 

PT and identifying agent that modulates activity of cancer related gene. 
XX 

PS Claim 12; SEQ ID NO 3; 59pp; English. 
XX 

CC The invention relates to an isolated prostate specific polypeptide 

CC comprising one or more immunogenic fragments. The invention also relates 

CC to a method of identifying an agent that modulates the activity of a 

CC cancer related gene involving contacting a compound with a cell 

CC containing a gene under conditions promoting the expression of the gene, 

CC detecting a difference in expression of the gene relative to when the 

CC compound is not present and identifying an agent that modulates the 

CC activity of a cancer related gene, a method of identifying an anti- 

CC neoplastic agent involving contacting a cell exhibiting neoplastic 

CC activity with a compound first identified as a cancer related gene 

CC modulator using and determining a decrease in neoplastic activity after 

CC contacting, when compared to when the contacting does not occur, or 

CC administering an agent first identified to an animal exhibiting a cancer 

CC condition and detecting a decrease in cancerous condition, a method of 

CC determining the cancerous status of a cell involving determining an 

CC increase in the level of expression in a cell of a gene where an elevated 

CC expression relative to a known non-cancerous cell indicates a cancerous 

CC state or potentially cancerous state, an antibody that reacts with a 

CC prostate specific polypeptide, an immunoconjugate comprising the antibody

CC and a cytotoxic agent, a method of treating cancer involving contacting a 

CC cancerous cell in vivo with an agent having activity against a prostate 



CC specific polypeptide and an immunogenic composition the prostate specific 

CC polypeptide. The prostate specific polypeptide is useful for identifying 

CC an agent that modulates the activity of a cancer related gene. The 

CC immunogenic composition is useful for treating cancer, preferably 

CC prostate cancer in an animal, e.g. human, which involves administering 

CC the immunogenic composition that is sufficient to elicit the production 

CC of cytotoxic T lymphocytes specific for the prostate specific 

CC polypeptide. The invention is useful for identifying anti-neoplastic 

CC agents. This sequence represents a human prostate specific polypeptide of 

CC the invention. 

XX 

SQ Sequence 843 AA;



Query Match 88.2%; Score 4364.5; DB 9; Length 843; 

Best Local Similarity 99.6%; Pred. No. 0; 

Matches 824; Conservative 0; Mismatches 0; Indels 3; Gaps 2; 



Qy     1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60
         |||||||||||||||||||||||||||||||||||||||||||||||||||  |||||||
Db     5 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCA--RMLRRRA 62

Qy    61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120
         ||||||||||||||||||||||||||||||||||||||||||||||||| ||||||||||
Db    63 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRI-DFVLVWEEDL 121

Qy   121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   122 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 181

Qy   181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 240
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   182 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRF 241

Qy   241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   242 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 301

Qy   301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   302 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 361

Qy   361 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   362 AWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 421

Qy   421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   422 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 481

Qy   481 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540
         |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   482 TGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 541

Qy   541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   542 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 601

Qy   601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   602 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 661

Qy   661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   662 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 721

Qy   721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780
         ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   722 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 781

Qy   781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTC 827
         |||||||||||||||||||||||||||||||||||||||||||||||
Db   782 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTC 828





RESULT 4 
ABG15488 

ID ABG15488 standard; protein; 898 AA. 
XX 

AC ABG15488; 
XX 

DT 18-FEB-2002 (first entry) 
XX 

DE Novel human diagnostic protein #15479. 
XX 

KW Human; chromosome mapping; gene mapping; gene therapy; forensic; 

KW food supplement; medical imaging; diagnostic; genetic disorder. 
XX 

OS Homo sapiens. 
XX 

PN WO200175067-A2. 
XX 

PD 11-OCT-2001.
XX 

PF 30-MAR-2001; 2001WO-US008631.
XX

PR 31-MAR-2000; 2000US-00540217.

PR 23-AUG-2000; 2000US-00649167.
XX 

PA (HYSE-) HYSEQ INC.
XX 

PI Drmanac RT, Liu C, Tang YT; 
XX 

DR WPI; 2001-639362/73. 

DR N-PSDB; AAS79675.
XX 



PT New isolated polynucleotide and encoded polypeptides, useful in 

PT diagnostics, forensics, gene mapping, identification of mutations 

PT responsible for genetic disorders or other traits and to assess 

PT biodiversity. 



XX 
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PS Claim 20; SEQ ID NO 45847; 103pp; English. 
XX 

CC The invention relates to isolated polynucleotide (I) and polypeptide (II) 

CC sequences. (I) is useful as hybridisation probes, polymerase chain 

CC reaction (PCR) primers, oligomers, and for chromosome and gene mapping, 

CC and in recombinant production of (II). The polynucleotides are also used 

CC in diagnostics as expressed sequence tags for identifying expressed 

CC genes. (I) is useful in gene therapy techniques to restore normal 

CC activity of (II) or to treat disease states involving (II). (II) is 

CC useful for generating antibodies against it, detecting or quantitating a 

CC polypeptide in tissue, as molecular weight markers and as a food 

CC supplement. (II) and its binding partners are useful in medical imaging 

CC of sites expressing (II). (I) and (II) are useful for treating disorders 

CC involving aberrant protein expression or biological activity. The 

CC polypeptide and polynucleotide sequences have applications in 

CC diagnostics, forensics, gene mapping, identification of mutations 

CC responsible for genetic disorders or other traits to assess biodiversity 

CC and to produce other types of data and products dependent on DNA and 

CC amino acid sequences. ABG00010-ABG30377 represent novel human diagnostic 

CC amino acid sequences of the invention. Note: The sequence data for this 

CC patent did not appear in the printed specification, but was obtained in 

CC electronic format directly from WIPO at 

CC ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 898 AA; 

Query Match 75.5%; Score 3736; DB 4; Length 898; 

Best Local Similarity 82.3%; Pred. No. 0; 

Matches 727; Conservative 4; Mismatches 16; Indels 136; Gaps 6; 

Qy 1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAER 37

I I I I I I I I I I I I I I I II I I I I I I I I I I II I I I I I I I I

Db 1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERATDWLLAPFCQPKTRSHGTCPP 60

Qy 38 -W AMTSETS SG 47

I I : I I I

Db 61

Qy 48 SHCA RSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAH 87

III : I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I

Db 121 SPCAHGQESLPSQPSPILLRVESVKSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAH 180

Qy 88 ASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDN 147

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 181 ASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDN 240

Qy 148 LRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLL 207

I I ! I I 1 I I f I I I I I I I II I I I I I I I I i I I I I I I I I I I I I I I I I II ! : I :

Db 241 LRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQDYPTRPPTGRPACC 300

Qy 208 AWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTP 267

1 1 1 1 1 1 1 1 1 1 i i m i i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 m 1 1 1 1 1 i 1 1 1 1 1 1 1 ii 1 1 1 1 1 1 1 1 1 1 1 1

Db 301 AWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTP 360

Qy 268 YGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWN 327

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 1 I I I I I 1 I I I I I I i I I I I i I I I I I I I I

Db 361 YGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWN 420

Qy 328 KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAVVGTLVFLVGCFLVFSDIPTQELCGS 387

I I I I I ( II I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I

Db 421 KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAVVGTLVFLVGCFLVFSDIPTQELCGS 480

Qy 388 KDSFEMCPLCLDCPFWLLSSACALAQ AGRLFDHGGTVFFSLFMALWAVLLLEYWKR 443

I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I II I I I ! I I I I I I I I I I

Db 481 KDSFEMCPLCLDCPFWLLSSACALAQVREEAGRLFDHGGTVFFSLFMALWAVLLLEYWKR 540

Qy 444 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSVV 503

1 1 1 1 1 M 1 1 i 1 1 1 1 1 1 1 1 n 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 M i M M

Db 541 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSVV 600

Qy 504 IVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASLTGSVVNLVFILILSKI 563

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I II II I I I I I I I I I I I I I I

Db 601 IVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASLTGSVVNLVFILILSKI 660

Qy 564

I I I I I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I



Db 661 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTL 720 

Qy 624 FGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGA 683 

I I I I I I I I I M I 1 1 I I I I I I I I I I I I I I M I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 721 FGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGA 780 

Qy 684 SAGASQGPWEDDYELVPCEGLFDEYLEM 711

I I I I I I I I I I I I I I I I I I I I I I I I I I I I 

Db 781 SAGASQGPWEDDYELVPCEGLFDEYLEMGAGFCPNACPELVPELTEPEKARDQPEARSAG 840



Qy 712 VLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKF 747 

I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I I 
Db 841 QDSRPEAVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKF 883
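The percentages in the summary lines that precede each alignment can be cross-checked against the counts printed beside them. The sketch below assumes two relationships that are inferred from the figures in this listing rather than taken from GenCore documentation: Query Match is the score divided by the query's perfect score (4950 in the run header), and Best Local Similarity is the number of matches divided by the total number of aligned columns. The function names are hypothetical.

```python
# Assumed relationships, inferred from the figures printed in this listing:
#   Query Match (%)           = 100 * Score / perfect score of the query
#   Best Local Similarity (%) = 100 * Matches / aligned columns
# where aligned columns = Matches + Conservative + Mismatches + Indels.

PERFECT_SCORE = 4950  # "Perfect score" stated in the run header


def query_match(score, perfect_score=PERFECT_SCORE):
    """Return the score as a percentage of the perfect score, to one decimal."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity(matches, conservative, mismatches, indels):
    """Return identical columns as a percentage of all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


if __name__ == "__main__":
    # Counts copied from the 898 AA hit above and from RESULT 5.
    print(query_match(3736), best_local_similarity(727, 4, 16, 136))
    print(query_match(1531.5), best_local_similarity(360, 168, 316, 105))
```

Under these assumptions the computed values agree with the printed 75.5% / 82.3% and 30.9% / 37.9%, which makes the check useful for spotting digits damaged during extraction.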



RESULT 5 
ADB64420 

ID ADB64420 standard; protein; 920 AA. 
XX 

AC ADB64420; 
XX 

DT 04-DEC-2003 (first entry) 
XX 

DE Human protein encoded by clone FEBRA20031280. 
XX 

KW Human; pharmaceutical; diagnostic; gene therapy; tissue regeneration; 

KW cell regeneration; membrane protein; signal transduction-related protein;

KW transcription-related protein; osteoporosis; neurological disease; 

KW cancer; tumour. 

XX 

OS Homo sapiens. 
XX 

PN EP1308459-A2. 
XX 

PD 07-MAY-2003. 
XX 

PF 28-MAR-2002; 2002EP-00007401.
XX

PR 05-NOV-2001; 2001JP-00379298.

PR 25-JAN-2002; 2002US-00350978.
XX 

PA (HELI-) HELIX RES INST. 

PA (REAS-) RES ASSOC BIOTECHNOLOGY. 

XX 

PI Isogai T, Sugiyama T, Otsuki T, Wakamatsu A, Sato H, Ishii S;

PI Yamamoto J, Isono Y, Hio Y, Otsuka K, Nagai K, Irie R, Tamechika I;

PI Seki N, Yoshikawa T, Otsuka M, Nagahari K, Masuho Y; 

XX 

DR WPI; 2003-450961/43. 

DR N-PSDB; ADB62450.
XX 

PT New polynucleotides and polypeptides, useful for developing a diagnostic

PT marker or medicines for regulation of their expression and activity, or 

PT as targets of gene therapy. 
XX 

PS Claim 1; Page; 222pp; English. 
XX 

CC The invention discloses a polynucleotide comprising a sequence selected 

CC from 1970 fully defined nucleotide sequences which encode novel 

CC polypeptides. Also claimed is a polypeptide encoded by the polynucleotide 

CC or its partial peptide, an antibody binding to the polypeptide or peptide 

CC of the polynucleotide, immunologically assaying the polypeptide or 

CC peptide of the polynucleotide by contacting the polypeptide or peptide 

CC with the antibody of the encoded protein, and observing the binding 

CC between the two, a transformant carrying the polynucleotide in an 

CC expressible manner and an antisense polynucleotide. The oligonucleotide 

CC is useful as a primer for synthesising the polynucleotide, or as a probe 

CC for detecting the polynucleotide. The polynucleotides and encoded 

CC proteins are useful as pharmaceutical agents and many disease-related 

CC genes may be included in them, for developing a diagnostic marker or 

CC medicines for regulation of their expression and activity, or as targets 

CC of gene therapy. The genes are involved in tissue and/or cell

CC regeneration. Membrane proteins, signal transduction-related proteins,

CC transcription-related proteins, disease-related proteins and genes

CC encoding them can be used as indicators for diseases (e.g. osteoporosis,

CC neurological diseases, cancer, tumours). The cDNA may be used to regulate

CC the activity or expression of the encoded protein to treat diseases. The

CC sequence presented is a protein of the invention. Note: Some of the 

CC sequence data for this patent is not represented in the printed 

CC specification, but is based on sequence information supplied by the 

CC European Patent Office. 

XX 

SQ Sequence 920 AA; 

Query Match 30.9%; Score 1531.5; DB 7; Length 920; 

Best Local Similarity 37.9%; Pred. No. 2.5e-148;

Matches 360; Conservative 168; Mismatches 316; Indels 105; Gaps 29; 

Qy 44 TSSGSHCARSRMLRRRAQEEDSTVLID VSPPEAE KRGSYGST AHASEP 91

: I I I :: :: I : I : I I III: I : II I I

Db 4 SSSGITNGKTKVFHPVA--KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61

Qy 92

II: : | | | | : : | | : : : :| : Ml

Db 62 GGETVPERNKSNGLYFRDGKCRI-DYILVYRK SNPQTEK REVFER 105

Qy 147

Mil

Db 106

Qy 203 -EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTST 253

I : I I : : I : I :: |: I :: I I I :

Db 165

Qy 254 KRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQR 313

I : I : II: I I I I : I : : : I I I I I I I I I : I :: : II

Db 224 TRSRIVHHILQRIKY-EEGKNKIGLNRLLTNGSYEAAFPLHEGSYRSKNSIRTHGAENHR 282

Qy 314 QVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCF 373

: I : : II II I I I I I I I I I I I I I I I: I I I I I ! I : I I I I III : I III I

Db 283 HLLYECWASWGVWKYQPLDLVRRYFGEKIGLYFAWLGWYTGMLFPAAFIGLFVFLYGVT 342

Qy 374 LVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMAL 432

: : : | : | : | I I I : I III II : I I : I I I : I I I I I : : I I I :

Db 343 TLDHSQVSKEVCQATDII-MCPVCDKYCPFMRLSDSCVYAKVTHLFDNGATVFFAVFMAV 401

Qy 433 WAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPER 491

I I : I I : I I I : I : I I I I I : I : I I I I I I I : I I I : |: I I I

Db 402 WATVFLEFWKRRRAVIAYDWDLIDWEEEEEEIRPQFEAKYSKKERMNPISGKPEPYQAFT 461

Qy 492 SRARRMLAGSWIWMVAVWMCLVSIILYRAIMAI WSRSGNTLLA-AWA SRIA 545

: t : : : I I : I I : : I : : I I : : I I II I : : I

Db 462 DKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV STFAAFKWALIRNNSQVA 514

Qy 546

: I I : I I I :: I : : I : I : I I I I I : : : : I : : I I I I : I : I I I I I II

Db 515 T-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSST 573

Qy 604 VYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLI 662

I I I I I I ! I I : I I I i t III. I I I t : I :: : I II I I I I I :

Db 574 FYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI IMVLKQTWNN FMELGY 633

Qy 663 PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFVTI 720

I :: I I : I : :: I I I I I I I I M I I I I I I I : I I I i I I I

Db 634 PLIQNWWTR RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFTTI 690

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

I I I I I I I II I I I I I : I I I I I I III : : I I I : I I I : I I I I I i II I : I : I I : !

Db 691 FVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITN 750

Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816

I I :: I : I I I : I I I : : | :: | :| :

Db 751 AFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDG 810

Qy 817 SSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLL 871

II: : I I I I : I I I :. : I : : I I I I I I : I III I : I I : I : I

Db 811 SEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLISYL 870

Qy 872

:||:|: : :::|| II :: : I |: |: : I : I

Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919



RESULT 6
ABP58666 

ID ABP58666 standard; protein; 920 AA. 

XX

AC ABP58666; 
XX 

DT 24-MAR-2003 (first entry) 

XX

DE Human dihydropyrimidinase related protein 1-101.20. 
XX 

KW Human; dihydropyrimidinase related protein 1-101.20; 

KW recombinant production; gene therapy; psychosis; development disorder; 

KW uracil-related metabolic disorder; thymine-related metabolic disorder; 

KW pyrimidine metabolic disorder. 
XX 

OS Homo sapiens. 
XX 

PN CN1364894-A. 
XX 

PD 21-AUG-2002. 
XX 

PF 10-JAN-2001; 2001CN-00105195.
XX

PR 10-JAN-2001; 2001CN-00105195.
XX 

PA (BIOW-) BIOWINDOW GENE DEV INC SHANGHAI. 
XX 

PI Mao Y, Xie Y; 
XX 

DR WPI; 2003-000532/01. 

DR N-PSDB; ABZ57080. 
XX 

PT New polypeptide-human dihydropyrimidinase relative protein 1-101.20 and

PT polynucleotide for encoding such polypeptide. 

XX 

PS Claim 1; Page 28-30 (Disclosure); 36pp; Chinese. 
XX 

CC The invention relates to human dihydropyrimidinase related protein 1- 

CC 101.20 (ABP58666) and nucleic acids encoding it (ABZ57080). The protein

CC has a molecular weight of 101.2 kD. The invention also relates to a 

CC method for the recombinant production of the protein, an antagonist of 

CC the protein, and the use of the protein, gene and antagonist in 

CC therapeutic applications. Dihydropyrimidinase related protein 1-101.20 

CC can be used in the treatment of a variety of diseases such as psychosis, 

CC development disorders and uracil- and thymine-related metabolic 

CC disorders. The present sequence represents human dihydropyrimidinase 

CC related protein 1-101.20 

XX 

SQ Sequence 920 AA; 

Query Match 30.5%; Score 1511.5; DB 6; Length 920; 
Best Local Similarity 37.6%; Pred. No. 3e-146;

Matches 357; Conservative 169; Mismatches 318; Indels 105; Gaps 29; 

Qy 44 TSSGSHCARSRMLRRRAQEEDSTVLID VSPPEAE KRGSYGST AHASEP 91

: I i I : : : : 1 : I : I I ill: I : I I I I 

Db 4 SSSGITNGKTKVFHPVA--KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61

Qy 92 GGQQAAACRAGS PAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET FLD 146

II: : III I::||: : : :|: II I 

Db 62 GGETVPERNKSNGLYFRDGKCRI-DYILVYRK SNPQTEK REVFER 105 

Qy 147 NLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQE LPNQASNW 202

1:11 II :::: |: : : I I I II III : :::| : II : 

Db 106 NIRAEGLQMEKESSLI-NSDIIFVKLHAPWEVLGRYAEQMNVRMPFRRKIYYLPRRYKFM 164

Qy 203 S -AGLLAWLGIPNVLL — EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTST 253 

I : II : | | : | | : : I : I :: |: I :: I I I : 

Db 165 SRIDKQISRFRRWLPKKPMRLDKETLPDLEENDCYTAPFSQQRIHHFI-IHNKETFFNNA 223 

Qy 254 KRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQR 313 

I : I : II: I I I I : I : : : I I I I I I I I I : I ::: II 

Db 224 TRSRIVHHILQRIKY-EEGKNKIGLNRLLTNGSYEAAFPLHEGSYRSKNSIRTHGAENHR 282 

Qy 314 QVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCF 373

II II I II MM I II II III: III MUM II I Ml :| Ml i

Db 283 HLLYECWASWGVWYKYQPLDLVRRYFGEKIGLYFAWLGWYTGMLFPAAFIGLFVFLYGVT 342

Qy 374 LVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMAL 432

: : : I : I : I I M : I III MM | : II I : i M I I :: I I I :

Db 343 TLDHSQVSKEVCQATDII-MCPVCDKYCPFMRLSDSCVYAKVTHLFDNGATVFFAVFMAV 401

Qy 433 WAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPER 491

M : II : I I I :• I Ml II I M : II I I I I I : ' I I I : I : III

Db 402 WATVFLEFWKRRRAVIAYDWDLIDWEEEEEEIRPQFEAKYSKKERMNPISGKPEPYQAFT 461

Qy 492 SRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLA-AWA SRIA 545

: I : : : I MM: : |::|| : : I | || MM

Db 462 DKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV ST FAAFKWAL I RNN SQVA 514

Qy 546 SLTGSW--NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSP 603

: M : I I I : : I : : I M Ml I II : : : : I : M I I I : I M II I I M

Db 515 T-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSST 573

Qy 604 VYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLI 662

I II I I I II I : II I I I I I I MUM :: : I II M M I :

Db 574 FYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGI IMVLKQTWNNFMELGY 633

Qy 663 PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFVTI 720

I : : II : I : I I I I II I I I I M I I I I I M II I I II

Db 634 PLIQNWWTR---RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLYDEYLEMILQFGFTTI 690

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

MM I I II I Mill : I I I II I III :: I I I : I I I : I I I I : M I : MUM

Db 691 FVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIGYGILEGIGILSVITN 750

Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816

M : : I =111 = 11 I : : I : : I M :

Db 751 AFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDG 810

Qy 817 SSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLL 871

II: : MM Ml I : M : M I I I M M M M M I : I : I

Db 811 SEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLISYL 870

Qy 872 VPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920

: I MM : :::M II :: : I |: I: : I : I

Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919



RESULT 7 
ADK52114

ID ADK52114 standard; protein; 981 AA.
XX

AC ADK52114;
XX

DT 20-MAY-2004 (first entry)
XX

DE Human atopic dermatitis/psoriasis-associated protein #29.
XX

KW Human; atopic dermatitis; psoriasis; dermatological; anti-inflammatory;

KW antipsoriatic; rash.
XX

OS Homo sapiens.
XX

PN WO2004016785-A1.
XX

PD 26-FEB-2004.
XX

PF 06-AUG-2003; 2003WO-JP009999.
XX

PR 06-AUG-2002; 2002JP-00229319.

PR 14-MAY-2003; 2003JP-00136544.
XX

PA (GENO-) GENOX RES INC.

PA (UYJU-) UNIV JUNTENDO.
XX

PI Itoh M, Ogawa K, Shinagawa A, Sudo H, Ogawa H, Ra C;

PI Mitsuishi K;
XX

DR WPI; 2004-214514/20.

DR N-PSDB; ADK52028.
XX

PT Detecting atopic dermatitis or psoriasis comprises assaying levels of 

PT expression of an indicator gene at a rash site and non-rash site of a 

PT person with atopic dermatitis or psoriasis. 
XX 

PS Example 2; SEQ ID NO 147; 484pp; Japanese.
XX 

CC The invention relates to detecting atopic dermatitis or psoriasis 

CC comprising assaying the levels of expression of an indicator gene at a 

CC rash site and non-rash site of a person with atopic dermatitis or

CC psoriasis, comparing these levels with those of a healthy person, and 

CC determining that if the levels of indicators are higher or lower, then 

CC this indicates the disease. Also included are a reagent for detecting 

CC atopic dermatitis or psoriasis, a kit for screening for treatments, a 

CC transgenic non-human vertebrate animal model for the diseases, an agent

CC for inducing the diseases in mice and a DNA chip for assaying for the

CC indicator genes. The method is used for treatment, detection and animal 

CC models for research of atopic dermatitis and psoriasis. The present 

CC sequence is a protein encoded by an indicator gene of the invention. 

XX 

SQ Sequence 981 AA; 

Query Match 30.4%; Score 1504; DB 8; Length 981; 

Best Local Similarity 39.4%; Pred. No. 2e-145; 

Matches 329; Conservative 163; Mismatches 268; Indels 76; Gaps 24; 

Qy 106 KPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNT 165

I I I I :: I I : : I:: ill I I I I I I : :: : : 

Db 161 KRRI -DYILVYR KTNI PYDKRNTFEKNLRAEGLMLEKEPA- IASP 203 

Qy 166 TVHYALLSASWAVLCYYAEDLRLKLPLQ ELPNQASNWSAGLLAWLGIPNV 215 

: : : -I II III I :::| : : : : : I : : 

Db 204 DIMFIKIHIPWDTLCKYAERLNIRMPFRKKCYYTDGRSKSMGRMQTYFRRIKDWMAQNPM 263

Qy 216 LLE — WPDV-PPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPY — GH 270 

:|: II: : I: I :: I: :|:||||:: I :|:: :| :| I I 
Db 264 VLDKSAFPDLEESDCYTGPFSRARIHHFI-INNKDTFFSNATRSRIVYHMLERTKYENGI 322 

Qy 271 EKKNLLGIHQLLAEGVLSAAFPLHDGPFKT PPEGPQAPRLNQ RQVLFQHWARWGKW 326

I : I I : I : I I I I I I :| :| : III I I : I :: I I I I I I 
Db 323 SK VGIRKLINNGSYIAAFPPHEGAYKSSQPIKTHGPQ NNRHLLYERWARWGMW 375 

Qy 327 NKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCG 386 

I : I I I I : I I I I I I : I I I II I I : I I I I : I I I : I I II I I : : : I I : I 
Db 376 YKHQPLDLIRLYFGEKIGLYFAWLGWYTOILIPAAIVGLCVFFYGLrrMNNSQVSQEICK 435 

Qy 387 SKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKS 445 

:: I I I I I I : I I : : I I : I I I : I I I I I I : : I I I : I I : I I : I I I : 
Db 436 ATEVF-MCPLCDKNCSLQRLNDSCIYAKVTYLFDNGGTVFFAIFMAIWATVFLEFWKRRR 494

Qy 446 ATLAYRWDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPERSRARRMLAGSWI 504

: I I I I :: I : I I I I I I I I I I I I I : I I : I : I : I I 
Db 495 SILTYTWDLIEWEEEEETLRPQFEAKYYKMEIVNPITGKPEPHQPSSDKVTRLLVSVSGI 554

Qy 505 WMVAVVVMCLVSIILYR-AIMAIVVSRSGNTLLAAWASRIASLTGSV-VNLVFILILSK 562

|:::|: : :::|| :| I I : I : |: :| :| : |::|: 
Db 555 FFMISLVITAVFGVWYRLWMEQFASFKWNFIKQYW — QFATSAAAVCINFI I IMLLNL 612 

Qy 563 IYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHT 622 

I : I : : I I I I I : : : : I : : I I I : I : I I I I I II I I I I I I I I I I : I I I : 
Db 613 AYEKIAYLLTNLEYPRTESEWENSFALKMFLFQFVNLNSSIFYIAFFLGRFVGHPGKYNK 672 

Qy 623 LFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKA 681 

I I I I I I I I I I : I : : I I I I I : I I I : I : : I I : : : I 
Db 673 LFDRWRLEECHPSGCLIDLCLQMGVIMFLKQIWNNFMELGYPLIQNWWSRHKI -KR 727 

Qy 682 GASAGASQGPWEDDYELVP--CEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVE 739

i II I I : I : I I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I : I 
Db 728 GIH-DASIPQWENDWNLQPMNLHGLMDEYLEMVLQFGFTTIFVAAFPLAPLLALLNNIIE 786 

Qy 740 IRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW- 798 

I I I I I I I I :: I I I : I I I I I I I III: I I I I : I I I : : I : I I : : I I I : 
Db 787 IRLDAYKFVTQWRRPLPARATDIGIWLGILEGIGILAVITNAFVIAITSDYIPRFVYEYK 846 

Qy 799 TRAHDLRGFLNFTLARAP-SSFAAAHNRTCRYRAFR DDDGHYSQTY 843 

: I:|::| :|: I : MM :l :: I
Db 847 YGPCANHVEPSENCLKGYVNNSLSFFDLSELGMGKSGYCRYRDYRGPPWSSKPYEFTLQY 906

Qy 844 WNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEV 899

I:: I I I I I I : I I I I I : I I : : I : I I : I : : : : : I I I I : : : I I : 

Db 907 WHILAARLAFIIVFEHLVFGIKSFIAYLIPDVPKGLHDRIRREKYLVQEMMYEAEL 962 

RESULT 8 
AEG11142 

ID AEG11142 standard; protein; 960 AA. 
XX 

AC AEG11142;
XX 

DT 20-APR-2006 (first entry) 
XX 

DE Human transmembrane protein 16A, SEQ ID NO: 7. 
XX 

KW Genetic marker; diagnostic; prognosis; gastrointestinal tumor; 

KW cytostatic; neoplasm; tumor marker; transmembrane protein 16A.
XX 

OS Homo sapiens. 
XX 

PN US2006040292-A1. 
XX 

PD 23-FEB-2006. 
XX 

PF 08-JUL-2005; 2005US-00177894.
XX

PR 08-JUL-2004; 2004US-0586676P.
XX 

PA (WEST/) WEST R B. 

PA (VRIJ/) VAN DE RIJN M. 

XX 

PI West RB, Van De Rijn M; 
XX 

DR WPI; 2006-182760/19. 

DR N-PSDB; AEG11136. 

DR REFSEQ; NP_060513. 
XX 

PT Classifying tumor as gastrointestinal stromal tumor belonging to PDGFRA 

PT positive subclass, involves detecting expression or activity of gene

PT encoding DOG1 polypeptide in sample.
XX 

PS Disclosure; SEQ ID NO 7; 177pp; English. 
XX 

CC The present invention relates to three gene markers such as DOG1, KIT and 

CC platelet derived-growth factor receptor alpha (PDGFRA) that are useful in 

CC classifying tumors. These gene markers are useful in the classification 

CC of gastrointestinal stromal tumors (GISTs) and tumors other than GISTs. 

CC The invention also relates to methods providing diagnostic, prognostic 

CC and predicative information based on the classifying step. The invention 

CC is useful for classifying gastrointestinal stromal tumors as belonging to 

CC a PDGFRA positive subclass, KIT negative or PDGFRA negative subclass. The 

CC present sequence is human transmembrane protein 16A (DOG1; TMEM16A). The

CC DOG1 gene encodes a transmembrane protein of unknown function

CC (transmembrane protein 16A). The transmembrane protein 16A is encoded by

CC DOG1 gene that is mapped to 11q13.2 on chromosome 11.
XX 

SQ Sequence 960 AA; 

Query Match 30.1%; Score 1488; DB 10; Length 960; 
Best Local Similarity 37.6%; Pred. No. 8.6e-144; 

Matches 363; Conservative 160; Mismatches 307; Indels 136; Gaps 28; 

Qy 26 GLYCRDQAHAERWAMT--SETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYG 83

III II : : : I I : I I I I I I : I 

Db 52 GLYFRDGRRKVDYILVYHHKRPSG NRTLVRRVQHSDTP SGA 92

Qy 84 ST AH AS E P GGQQ AAAC RAGSPAKPRIAD FVL VWE E D LK L D RQQD S AARDRT DMH RT WRET 143 

: I : I : I I I I ' :| :| III 

Db 93 RSVKQDHPLPGKGASLDAGSGEPP MDYHEDD — KRFRREE 130 

Qy 144 FLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLPLQELPNQAS 200

: II III:: : I : I : I : :. I I I I I I I I : I I : I : : : : 
Db 131 YEGNLLEAGLELE RDEDTKIHGVGFVKIHAPWNVLCREAEFLKLKMPTKKMYH — I 184

Qy 201 NWSAGLLAWLGIPNVLLEWPDVPPEYYSCR FRVNKLPRFLGSDNQDTFF 250

I : Ml ■ I :|| :: : |: I' I I I II :|:|| 

Db 185 NETRGLLK— KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFDLSD-KDSFF 241 

Qy 251 TSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRL 310 

.1 I I : : I I I : I I : : I I I I I II : I I : I I I I I : : 
Db 242 DSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY NGENVEF 295

Qy 311 NQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLV 370 

I I : : I : : I I I : I : I I I I : I II : I I I I I : I I I I I I I II I : I I : : I I : I I I 

Db 296 NDRKLLYEEWARYGVFYKYQPI DLVRKYFGEKIGLYFAWLGVYTQMLI PASIVGI I VFLY 355 

Qy 371 GCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLF 429 

II : : I I : I : I : : I I I I I I : I : I I I I I I : I III: I I I I I : I 

Db 356 GCATMDENIPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVF 415 

Qy 430 MALWAVLLLEYWKRKSATLAYRWDCSDYEDTEE RPRPQFAA SAPMTAPNPI 480

I I I I I : I : I I II I I I I I : : I : I I I I : : I I : I 

Db 416 MALWAATFMEHWKRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNKE 475

Qy 481 TGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLAAW 540 

I I : II I I I : I : I I : : : I : I I II : : : : : : 

Db 476 T--DKVKLTWRDRFPAYLTNLVSIIFMIAVTFAIVLGVIIYRISMAAALAMNSSPSVRSN 533

Qy 541 ASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600 

: I : : I I I I : : I : : I : I I I : I : : I : I I : I I :: I I I I 
Db 534 IRVTVTATAVI INLWI I LLDEVYGC I ARWLTKI EVPKTEKSFEERLI FKAFLLKFVNSY 593 

Qy 601 SSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI-NNMQ 658 

: I : I I I I I I I I I I I : I : I I I I II I I I I : I I : I : I I : I I I : I I I : 
Db 594 TPIFYVAFFKGRFVGRPGDYVYIFRS FRMEECAPGGCLMELCIQLSIIMLGKQLIQNNLF 653 

Qy 659 EVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFV 718 

I : I I I :| : : I : : ' : : I I I I I II I I :| I : :| I I I I 

Db 654 EIGIPKMKKLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMIIQFGFV 713 

Qy 719 TIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVI 778 

I : I I I : I I I I I I I I I I I : I I I I I I : I I I I I I I I I I I : I I I I I : : I I I : Mil 
Db 714 TLFVASFPLAPLFALLNNIIEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIGKLAVI 773 

Qy 779 SNAFLLAFSSDFLPRA — YYRWTRAHDLRGFLNFTLARAPSSF AAAHN 824 

II I :: : I : II I : II I ::: : MM II III II 
Db 774 INAFVISFTSDFIP RL VY L YMY S KNGTMH G FVNH T L SSFNVSDFQNGTAPNDPLDL 829 

Qy 825 RTCRYRAFRD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI 875 

: I M : : I : : I I : M M I I I I II II I : : M : . : I : : I I I 
Db 830 GYEVQICRYKDYREPPWSENKYDI SKDFWAVLAARLAFVIVFQNLVMFMSDFVDWVI PDI 889 

Qy 876 PESVEI KVKREYYLA KQALAENEVLFGTNGTKDEQP KG 913 

I : : :: : I I I I I I I II I I 

Db 890 PKDISQQIHKEKVLMVELFMREEQDKQQLL — ETWMEKERQKDEPPCNHHNTKACPDSLG 947 

Qy 914 SELSSH 919 

I II 

Db 948 SPAPSH 953 
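Each hit in this listing opens with a record whose lines begin with a two-letter code (ID, AC, DT, DE, KW, OS, PN, PD, PF, PR, PA, PI, DR, PT, PS, CC, SQ), with XX lines as separators. The following is a minimal sketch of how such a record could be collected into fields, assuming that every field line has the shape "two capital letters, whitespace, text" and that repeated codes are continuation lines. The helper name `parse_record` and the sample list are illustrative only and are not part of GenCore.

```python
import re
from collections import OrderedDict

# Assumed line shape: two capital letters, whitespace, then the field text.
# Separator lines ("XX"), alignment rows ("Qy ...", "Db ...") and match lines
# do not fit this shape and are skipped.
FIELD_LINE = re.compile(r"^([A-Z]{2})\s+(\S.*)$")


def parse_record(lines):
    """Collect two-letter-coded lines into an ordered mapping of code to text."""
    fields = OrderedDict()
    for raw in lines:
        found = FIELD_LINE.match(raw.strip())
        if not found or found.group(1) == "XX":
            continue
        code, text = found.group(1), found.group(2).strip()
        # Repeated codes (e.g. several KW or CC lines) are joined with a space.
        fields[code] = fields[code] + " " + text if code in fields else text
    return fields


# Lines copied from the RESULT 8 record above.
sample = [
    "ID AEG11142 standard; protein; 960 AA.",
    "XX",
    "AC AEG11142;",
    "XX",
    "KW Genetic marker; diagnostic; prognosis; gastrointestinal tumor;",
    "KW cytostatic; neoplasm; tumor marker; transmembrane protein 16A.",
    "XX",
    "OS Homo sapiens.",
]
record = parse_record(sample)

if __name__ == "__main__":
    for code, text in record.items():
        print(code, "=>", text)
```

Joining continuation lines with a single space keeps multi-line KW and CC fields readable; a stricter parser might instead keep them as lists.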



RESULT 9 
AEG11146 

ID AEG11146 standard; protein; 840 AA. 
XX 

AC AEG11146; 
XX 

DT 20-APR-2006 (first entry) 
XX 

DE Human transmembrane protein 16A, SEQ ID NO: 11. 
XX 

KW Genetic marker; diagnostic; prognosis; gastrointestinal tumor; 

KW cytostatic; neoplasm; tumor marker; transmembrane protein 16A. 
XX 

OS Homo sapiens. 
XX 

PN US2006040292-A1. 
XX 

PD 23-FEB-2006. 
XX

PF 08-JUL-2005; 2005US-00177894.
XX 

PR 08-JUL-2004; 2004US-0586676P. 
XX 

PA (WEST/) WEST R B. 

PA (VRIJ/) VAN DE RIJN M. 

XX 

PI West RB, Van De Rijn M; 
XX 

DR WPI; 2006-182760/19. 

DR N-PSDB; AEG11141. 

DR GENBANK; AAH33036.
XX 

PT Classifying tumor as gastrointestinal stromal tumor belonging to PDGFRA 

PT positive subclass, involves detecting expression or activity of gene 

PT encoding DOG1 polypeptide in sample. 
XX

PS Disclosure; SEQ ID NO 11; 177pp; English. 
XX 

CC The present invention relates to three gene markers such as DOG1, KIT and 

CC platelet derived-growth factor receptor alpha (PDGFRA) that are useful in 

CC classifying tumors. These gene markers are useful in the classification 

CC of gastrointestinal stromal tumors (GISTs) and tumors other than GISTs. 

CC The invention also relates to methods providing diagnostic, prognostic 

CC and predicative information based on the classifying step. The invention 

CC is useful for classifying gastrointestinal stromal tumors as belonging to 

CC a PDGFRA positive subclass, KIT negative or PDGFRA negative subclass. The 

CC present sequence is human transmembrane protein 16A (DOG1; TMEM16A). The

CC DOG1 gene encodes a transmembrane protein of unknown function

CC (transmembrane protein 16A). The transmembrane protein 16A is encoded by

CC DOG1 gene that is mapped to 11q13.2 on chromosome 11.
XX 

SQ Sequence 840 AA;

Query Match 29.9%; Score 1479.5; DB 10; Length 840; 

Best Local Similarity 40.0%; Pred. No. 5.3e-143; 

Matches 340; Conservative 152; Mismatches 270; Indels 89; Gaps 22; 

Qy 135 DMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLP 191

I I II: II III:: : I : I : I : : I I I I I II I : I I : I 
Db 6 DDKRFRREEYEGNLLEAGLELE RDEDTKIHGVGFVKIHAPWNVLCREAEFLKLKMP 61

Qy 192 LQELPNQASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCR FRVNKLPRFL 241 

: :: : I : I I I I : I I :: : I : I I I I 

Db 62 TKKMYH — INETRGLLK — KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFD 117 

Qy 242 GSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTP 301

I I : I : I I I I I : : II I : I I : : I I I I I I I : I I : I I I I I : 
Db 118 LSD-KDSFFDSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY 172 

Qy 302 PEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAA 361 

: I I :: I : : I I I : I : I I I I :| I I : I I I I I : I I I I I I I II I : I I : 
Db 173 --NGENVEFNDRKLLYEEWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLIPAS 230

Qy 362 WGTLVFLVGCFLVFSDI PTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDH 420 

: I I : I I I I I : : I I : I : I : : Mill I : I : I I I I I I : I III: 
Db 231 IVGI IVFLYGCATMDENI PSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDN 290 

Qy 421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAA SAPMT 475 

I I I I I : I I I ! II : I : I I I I I I I I I : : I : I : I I : : I I 
Db 291 PATVFFSVFMALWAATFMEHWKRKQMRLNYRWDLTGFEEEEDHPRAEYEARVLEKSLKKE 350 

Qy 476 APNPITGEDEPYFPERSRARRMLAGSWIWMVAWVMCLVSIILYRAIMAIWSRSGNT 535

: I I I: II I I I: 1:11 : : : I : I I I I : : : : 

Db 351 SRNKET — DKVKLTWRDRFPAYLTNLVS 1 1 FMI AVTFAI VLGVI IYRI SMAAALAMNSSP 408 

Qy 536 LLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQ 595

: : : I : : I I I I : : I : : I : I I I : I : : I : I I : I h : 

Db 409 SVRSNIRVTVTATAVI INLWI ILLDEVYGCIARWLTKIEVPKTEKSFEERLIFKAFLLK 468 

Qy 596 FVNFYSSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI 654

till: I : I I I I I I I I I I I : I : I I I II I Nihil : I : I I : I I I : I 
Db 469 FVNSYTPIFYVAFFKGRFVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSIIMLGKQLI 528

Qy 655 -NNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVL 713 

I I : I : I I I : I : : I : : : : I I I I I II I I : I I : :
Db 529 QNNLFEIGIPKMKKLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMII 588



Qy 714 QFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLT 773

I I I I I I : I I I : I I I I I I I I I.I I : I II I I I : I I I I I I I I I I I : I I I I I : : | | I : 
Db 589 QFGFVTLFVASFPLAPLFALLNNI IEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIG 648 

Qy 774 HLAVISNAFLLAFSSDFLPRA— YYRWTRAHDLRGFLNFTLARAPSSF- AAAHN 824 

I I I I : I I : : : I : I I I : I I I : : : : I I : I I I I I I | | 

Db 649 KLAVIIDAFVISFTSDFIPRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPN 704

Qy 825 RTCRYRAFRD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDL 870 

: I I I : : I : : I I : : I : I I I N I I I I I : : : I : : I 
Db 705 DPLDLGYEVQICRYKDYREPPWSENKYDISKDFWAVLAARLAFVIVFQNLVMFMSDFVDW 764

Qy 871 LVPDIPESVEIKVKREYYLA KQALAENEVL FGTNGT KDEQ P 911 

::||||: : :: :| I III 1 II I I 

Db 765 VIPDIPKDISQQIHKEKVLMVELFMREEQDKQQLL— ETCMEKERQKDEPPCNHHNTKAC 822 

Qy 912 KGSELSSH 919 

II I j 

Db 823 PDSLGSPAPSH 833 
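In the alignment rows, each Qy or Db line carries a start coordinate, the aligned residues (with "-" for gaps) and an end coordinate, so the number of residue letters should equal end minus start plus one. The sketch below uses that relationship to flag rows damaged in extraction. It assumes rows of the form "label start sequence end" with no spaces inside the sequence; the name `check_row` is hypothetical.

```python
import re

# Assumed row shape: "Qy" or "Db", start coordinate, sequence without internal
# spaces (gaps written as "-"), end coordinate.
ROW = re.compile(r"^(Qy|Db)\s+(\d+)\s+(\S+)\s+(\d+)$")


def check_row(line):
    """Return (label, start, end, consistent) or None if the line is not a row.

    A row is consistent when its count of residue letters equals
    end - start + 1; gap characters are not counted.
    """
    found = ROW.match(line.strip())
    if not found:
        return None
    label = found.group(1)
    start, end = int(found.group(2)), int(found.group(4))
    residues = sum(1 for ch in found.group(3) if ch.isalpha())
    return label, start, end, residues == end - start + 1


if __name__ == "__main__":
    rows = [
        "Qy 914 SELSSH 919",
        "Db 948 SPAPSH 953",
        # One residue short of its coordinates, as extracted in this listing.
        "Qy 844 WNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEV 899",
    ]
    for row in rows:
        print(check_row(row))
```

Rows that fail the check are candidates for character-level damage (for example two adjacent V residues read as a single W); the check only detects the discrepancy and does not say where it lies.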



RESULT 10 
ADG48280

ID ADG48280 standard; protein; 1003 AA. 
XX 

AC ADG48280; 
XX 

DT 11-MAR-2004 (first entry)
XX 

DE Human retina-specific protein - C12orf3variants.
XX 

KW human; retina-specific protein; NETO1; retinal disease;

KW age related macular degeneration; night blindness; C12orf3variants.

XX

OS Homo sapiens. 

XX 

PN WO2003068967-A2. 
XX 

PD 21-AUG-2003. 
XX 

PF 18-FEB-2003; 2003WO-EP001625.
XX

PR 18-FEB-2002; 2002EP-00003675.

PR 21-FEB-2002; 2002US-0357857P.
XX 

PA (LYNK-) LYNKEUS BIO TECH GMBH. 
XX 

PI Stoehr BH, Weber FHB, Goehring F; 
XX 

DR WPI; 2003-767334/72. 

DR N-PSDB; ADG48279. 
XX 

PT New nucleic acid encoding retinal protein sNETO1, useful for diagnosis of

PT retinal disease, especially macular degeneration, also for drug screening 

PT and therapy. 
XX 

PS Claim 18; Fig 14; 199pp; English. 
XX 

CC The invention comprises the amino acid and coding sequences of a human 

CC retina-specific protein - NETO1. The DNA and protein sequences of the

CC invention are useful in the treatment of retinal diseases, such as 

CC macular degeneration (especially age related) and night blindness. The 

CC present amino acid sequence represents the human retina-specific protein 

CC C12orf3variants. 

XX 

SQ Sequence 1003 AA; 

Query Match 29.6%; Score 1464; DB 7; Length 1003; 
Best Local Similarity 37.4%; Pred. No. 2.8e-141; 

Matches 344; Conservative 167; Mismatches 284; Indels 124; Gaps 27;

Qy 80 GSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRT 139 

I I II II |::| |: :| : 

Db 129 GETGKEPHAGGPG DI ELG-PLDALEEERKEQ 158 
Qy 140 WRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQ- 198

Ml II I I I : : : I : : : : : : I I I I I I I : : I : I : : : 
Db 159 -REEFEHNLMEAGLEL-EKDLENKSQGSIFVRIHAPWQVLAREAEFLKIKVPTKKEMYEI 216 

Qy 199 ASNWSAGLLAWLGI PNVLLEWPDVPPEYYSCRFRVNKLP RFLGSDNQ 246 

I : I I I : : | | | | : : : : | : . 

Db 217 KAGGSIAKKFSAAL QKLSSHLQPRV-PEHSNNKMKNLSYPFSREKMYLYNIQEK 269 

Qy 247 DT FFTSTKRHQI LFEI LAKT PYGHEKKNLLGI HQLLAEGVLSAAFPLHDGPFKT PPEGPQ 306 

Mil: I : I : I I I : I I : I I : I : I : I I : I I I I I : : I : 

Db 270 DT FFDN AT RSRIVHEILKRTACS - RANNTMG INSLIANNI YE AA Y PLHDGEYDSPEDD — 326 

Qy 307 APRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTL 366 

: I I : : I : I I I I : I : I :.l I : I : I : I ( I I I : I I I I I I I II : I : I : : I : I : 
Db 327 MNDRKLLYQEWARYGVFYKFQPIDLI RKYFGEKI GLYFAWLGLYTS FL I PSSVI GVI 383 

Qy 367 VFLVGCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVF 4 25 

I I I I I : I I I : : I : I : : : I I I I I I I : I I I I I I III IN: III 
Db 384 VFLYGCATIEEDIPSREMCDQQNAFTMCPLCDKSCDYWNLSSACGTAQASHLFDNPATVF 443 

Qy 426 FSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEER PRPQFAA 4 70 

I I : I I II I I : I I. I I I I I II : hill I I : : 

Db 444 FSIFMALWATMFLENWKRLQMRLGYFWDLTGIEEEEERAQEHSRPEYETKVREKMLKESN 503 

Qy 471 -SAPMTAPNPIT GEDEPYFPERSRARRMLAGSWIWMVAVWMCLVS 1 1 LYRAIM 525 

II I : I I : I I I : I : I : : : I :| I 

Db 504 QSAVQKLETNTTECGDEDDEDKLTWKDRFPGYLMNFASILFMIALTFSIVFGVIVYRITT 563 

Qy 526 AIWSRSGNTLLAAWASRI ASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTK 582 

I : I I I I : : I : : I I I I I I I : I I : : I I I : |: : I : 

Db 564 AAALS LNKATRSNVRWVTATAVIINLWILILDEIYGAVAKWLTKIEVPKTEQT 618 

Qy 583 FEDAFTLKVFI FQFVN FYSS PVYI AFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLI ELA 641 

II: I I I :: I I I I I I : I I I I I I I I I I I : I : I I I I I I I I I I I : I I 
Db 619 FEERLILKAFLLKFVNAYSPIFYVAFFKGRFVGRPGSYVYVFDGYRMEECAPGGCLMELC 678 

Qy 642 QELLVI MVGKQV I -NNMQEVL I PKLKGWWQKFRLRSKKRKAGAS AGA- SQGP — WEDD YE 697 

:| :||:|||:| II: I: :|MI - : II : :|| : I I: I I : II 
Db 679 IQLS I IMLGKQL IQNN I FEI GVPKLK KLFRKLKDETEAGETDSAHSKH PEQWDLDYS 735 

Qy 698 LVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAE 7 57 

I I II I I: I I :: I I I I I I : I I I : 1111:111111 :| : I II |: I I I I III I 
Db 736 LEPYTGLT PEYMEMIIQFGFVTLFVASFPLAPVFALLNNVIEVRLDAKKFVTELRRPDAV 795 

Qy 758 RAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHD — LRGFLNFTLA — 813 

I :| I l-l I I I I : I : : I I I I I |::| : I I I : I I I : :: : |: I I I : I I I : 
Db 796 RTKDIGIWFDILSGIGKFSVISNAFVIAITSDFIPRLVYQYSYSHNGTLHGFVNHTLSFF 855 

Qy 814 RAPSSFAAAHNRTCRYRAFRD DDGHYSQTYWNLLAIRLAFVIVFEH 859 

: : I : | | :: : I : : : I : I I : I : I I I I I I : I : : 

Db 856 NVSQLKEGTQPENSQFDQEVQFCRFKDYREPPWAPNPYEFSKQYWFILSARLAFVI IFQN 915 

Qy 860 WFS VGRLLDLLVPD I PE SVE I KVKRE YYLAKQALAENEVLFGTNGTKDEQPKG 913 

:| : |:| ::|||| : ::|:| ::| : I : I I : II 
Db 916 LVMFLSVLVDWMI PDI PT DI SDQI KKEKSLLVDFFLKE EHEKLKLMDEPALRSPGG 971 

Qy 914 SELSSHOT PFTVPKA- SQL 931 

: I : I III 

Db 972 GDRS RS RAAS SAPSGQSQL 990 
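
The summary figures printed for Result 10 above can be cross-checked with a few lines of Python. This is a minimal sketch and not part of the GenCore output: it assumes that Query Match is the hit score divided by the perfect score (4950 for this query), and that Best Local Similarity is Matches divided by the total number of alignment columns (Matches + Conservative + Mismatches + Indels). Both definitions are inferred from the numbers in this report. The function name alignment_stats is hypothetical.

```python
def alignment_stats(score, perfect_score, matches, conservative, mismatches, indels):
    """Recompute the percentage fields of a GenCore result block.

    Assumed definitions (inferred from the report, not from vendor docs):
      Query Match           = score / perfect_score
      Best Local Similarity = matches / total alignment columns
    """
    columns = matches + conservative + mismatches + indels
    return {
        "columns": columns,
        "query_match_pct": round(100.0 * score / perfect_score, 1),
        "best_local_similarity_pct": round(100.0 * matches / columns, 1),
    }


# Values copied from Result 10 (ADG48280); perfect score taken from the run header.
stats = alignment_stats(
    score=1464, perfect_score=4950,
    matches=344, conservative=167, mismatches=284, indels=124,
)
print(stats)
```

Under these assumptions the recomputed percentages agree with the printed 29.6% and 37.4%, which also supports reading the OCR-split "34 4" as 344 matches.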

RESULT 11 
ABB62812 

ID ABB62812 standard; protein; 1219 AA. 
XX 

AC ABB62812; 
XX 

DT 26-MAR-2002 (first entry) 
XX 

DE Drosophila melanogaster polypeptide SEQ ID NO 15228. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster. 




XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US009231.
XX

PR 23-MAR-2000; 2000US-0191637P.

PR 11-JUL-2000; 2000US-00614150.
XX

PA (PEKE ) PE CORP NY.
XX

PI Venter JC, Adams M, Li PWD, Myers EW;
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL06915. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more

PT genes from Drosophila and for elucidating cell signaling and cell-cell 

PT interactions. 
XX 

PS Disclosure; SEQ ID NO 15228; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737-

CC ABB72072). The sequence data for this patent did not form part of the

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1219 AA; 

Query Match 29.2%; Score 1445; DB 4; Length 1219; 

Best Local Similarity 35.6%; Pred. No. 3.5e-139; 

Matches 342; Conservative 165; Mismatches 332; Indels 122; Gaps 27; 

Qy 35 AERWAMTSETSSGSHCARSRML RRRAQE E D ST VL I D VS P P EAEKRGS Y 82 

I: I : I II 1:1 I :| : . |: I III 
Db 249 ADRVNQSYEVMESSH SNVLPDQFGYRQLI PTERKASDTASSV SGSY 294 

Qy 83 GSTAHASEP GGQQAAACRAGSPAKP RIADFVLVW-EEDLKLDRQ 125 

: I I : II: I : I I I I I I I I I : : 

Db 295 YGSRKASKSNSLGGESGDERRVSKQDREGLDPESLMFRDGRRKVDMVLAWEEEDLGVMTE 354 

Qy 126 QDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQD-VQDGNTTVHYALLSASWAVLCYYAE 184 

:: II I :|::| I II |: :| I I : : I : II 

Db 355 AEAKRRDN RRSFMENLIKEGLEVELEDKSQSFNEKTFFLKIHLPWRLETRLAE 407 

Qy 185 DLRLKLP- LQELPNQASNWSAGLLAWLGIPNVLLEWPDVPP -225 

■ : I I I I I : : I I : : III 
Db 408 VMNLKLPVKRFITISVKPSWDEENWLRNMQYWKDVWQR-LTKKIQLDQTLLE GET 462 

Qy 226 EYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEG 285 

: : I : I : I : I II I : I : : : : I : I I : : : I I : I : : I 

Db 4 63 TFKAATANGNPEEQFIVKD-RATAFTSAQRSLMVMQVLIRTPFDESDRS — GIRRLMNDG 519 

Qy 28 6 VLSAAFPLHDGPFKTPPEGPQAPRLN-QRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVA 34 4 

I I I I : I : : I : :: |: I I : I I I : I I III I I :| I I : I :| 
Db 520 TYLGCFPLHEGRY DRPHSSGISLDRRVLYQTWAHPSQWYKKQPLCLVRKYFGDKIA 575 

Qy 345 L Y FAWLGF YT GWL L P AAWGT LVFLVGCFLVFSD — IPTQELCG — SKDSFEMCPLC-LD 399 

III NMII I: 1111111:1 : I: |::|:| : :llll 

Db 57 6 LYFCWLGFYTEMLVYPAWGTLCFIYGLATLESEDNTPSKEICNEYGTGNITLCPLCDKA 635 

Qy 400 CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYED 4 59 

I : I I : I :: III: I I I I :: I I : I I I I I I II : I : I I : I 
Db 636 CSYQRLSESCLFSRLTYLFDNPSTVFFAIFMS FWATTFLELWKRKQSVLVWEWDLHNV-D 694 

Qy 4 60 TEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWITOWAVVVMCLVSII 519 

:| 11:1 :| Ihl I III :|: I : :: |::||: :: I 

Db 695 MDEENRPEFETNATTFRMNPVTREKEPYMSTWNRSIRFVITGSAVLFMISWLSAVLGTI 7 54 
Qy 520 LYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIWSLAHVLTRWEMHRT 579

III : I : I : I I : I : : : I I I I : I I : : I I : I II I II 
Db 755 LYRITLVSVIYGGGGFFVKEHAKLFTSVTAALINLWIMILTRIYHRMAIKLTNLENPRT 814 

Qy 580 QTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHT L FGVRN EE CAAGG 635 

I : : I I :: I I : I I : I : I I I I I : I I I I I I I I I III: I :: I : I I 

Db 815 HTEYEDSYTFKIFFFEFMNFYSSLIYIAFFKGRFFDYPGDDQARKSEFFRLKNDICDPAG 874 

Qy 636 CLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDD 695 

I I I I : I : I I I I I I II I I I I II: : I : : I III 
Db 875 CLSELCIQLAIIMVGKQCWNNFMEYLFPKFWNWWR QRKHKQATKDESHLHMAWEQD 930 

Qy 696 YELV-PCE-GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRR 753 

I : I I I I I I I I I : I I : I I I I : I I I I I I I I I II I I I I I I I I I I I I : I I 
Db 931 YHMQDPGRLALFDEYLEMILQYGFVTLFVAAFPLAP'LFALLNNVAEIRLDAYKMVTQARR 990 

Qy 754 PVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYR — WTRAHDLRGFLNET 811 

I : I I I : I I I I : I I : I : I I : I I I I : : I : : I I |: I I I : :: III:: : 
Db 991 PLAERVEDIGAWYGILRI ITYTAWSNAFVIAYTSDFIPRMVYKFVYSETHTLAGYIEHS 1050 

Qy 812 LA RAPSSFAAAHNRTCRYRAFRDDDGHY SQTYWNLLAI RLAFVIVF 857 

I : : I : I |: I I : I : I I I I : : I I I I I I : I I 

Db 1051 LSIFNTSDYKEEWGASVSEKDPDTCQYRGYRNGPKDYEPYGLSPHYWHVFAARLAFVWF 1110 

Qy 858 EHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELS 917 

Mill: : : : : I I : I I : : : : I I I I I : I : : I I I : : 

Db 1111 EHWFVITGIMQFI IPDVPSEVKTQMQREQLLAKEAKYQ HG I KRAQGDSQD IM 1163 

Qy 918 S 918 

I 

Db 1164 S 1164 



RESULT 12 
ADC42854

ID ADC42854 standard; protein; 910 AA. 

XX

AC ADC42854; 
XX 

DT 18-DEC-2003 (first entry) 
XX 

DE REMAP protein. #14. 
XX 

KW Cytostatic; Antiarteriosclerotic; Anti-HIV; Antiinflammatory; 

KW Antiallergic; Antidiabetic; REMAP; pathogenesis. 

XX 

OS Homo sapiens. 
XX 

PN WO2003027228-A2. 
XX 

PD 03-APR-2003. 
XX 

PF 16-JUL-2002; 2002WO-US022833 . 
XX 

PR 17-JUL-2001; 2001US-0306020P.
PR 27-JUL-2001; 2001US-0308179P.

PR 02-AUG-2001; 2001US-0309702P.

PR 10-AUG-2001; 2001US-0311476P.

PR 10-AUG-2001; 2001US-0311551P.

PR 10-AUG-2001; 2001US-0311718P.

PR 24-AUG-2001; 2001US-0314798P.

PR 31-AUG-2001; 2001US-0316639P.

PR 07-SEP-2001; 2001US-0317996P.
XX 

PA (INCY-) INCYTE GENOMICS INC. 
XX 

PI Lai PG, Honchell CD, Forsythe IJ, Walia NK, Tang TY, Borowsky ML; 

PI Barroso I, Yue H , Warren BA, Thangavelu K, Gietzen KJ, Azimzai Y; 

PI Lee EA, Baughn MR, Gorvad AE, Duggan BM, Tran B, Li JX; 

PI Richardson TW, Elliott VS, Zebarjadian Y, Tran UK, Yao MG; 

PI Peterson DP, Luo W, Lehr-Mason PM; 

XX 

DR WPI; 2003-421156/39. 
XX 

PT New human receptors and membrane-associated proteins (REMAP), useful for 

PT diagnosing, treating or preventing disorders associated with aberrant
PT REMAP expression, e.g. cancer, AIDS, atherosclerosis, hypertension or

PT stroke. 

XX 

PS Claim 1; SEQ ID NO 14; 115pp; English.
XX 

CC The present invention relates to an isolated polypeptide. The 

CC polypeptides and polynucleotides are useful in diagnosing, treating and 

CC preventing disorders associated with aberrant expression of REMAP, such

CC as cell proliferative, autoimmune/inflammatory, renal, neurological,

CC cardiovascular, metabolic, developmental, endocrine, muscle, 

CC gastrointestinal, lipid metabolism or transport disorders, and viral 

CC infections. These are also useful in assessing the effects of exogenous 

CC compounds on the expression of nucleic acids and amino acid sequences of 

CC REMAP, in facilitating drug discovery process, and in investigating the 

CC pathogenesis of diseases or medical conditions. Expression and

CC purification were achieved using bacterial or virus-based expression 

CC systems. The present sequence represents an REMAP protein of the 

CC invention. 

XX 

SQ Sequence 910 AA; 

Query Match 28.3%; Score 1402.5; DB 7; Length 910; 

Best Local Similarity 38.2%; Pred. No. 5.7e-135; 

Matches 322; Conservative 157; Mismatches 286; Indels 79; Gaps 24; 

Qy 108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET FLDNLRAAGLCVD-QQDVQDGNTT 166 

I I I I I I : I : : : :: : : : I I : : II I I : : : I I 
Db 67 RRIDFVLVYED ESRKETNKKGTNEKQRRKRQAYESNLICHGLQLEATRSVLDDKLV 122 

Qy 167 VHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSA — GLLAWLGIPNVLLEWPDVP 224 

: : I I I II I I I : : I ! I I : II I I : III : I : 

Db 123 — FVKVHAPWEVLCTYAE IMH I KL PLK — PNDLKN RS SAFGTLNWFT KVLSVD ES 1 1 KPE 178 

Qy 225 PEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYG-HEKKNLLGIHQLLA 283 

I :: : I I :: I I : I I I I : I :: I I : : I : I I : : I : ' 

Db 179 QEFFTAPFEKNRMNDFYIVD-RDAFFNPATRSRIVYFILSRVKYQVINNVSKFGINRLVN 237 

Qy 284 EGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKV 34 3 

I : I I I I I I I I : II I I : I : I : : I I I I I I I : I : I : I I I : 

Db 238 SGIYKAAFPLHDCKFRRQSEDPSCP — NERYLLYREWAHPRSIYKKQPLDLIRKYYGEKI 295 

Qy 34 4 ALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELC GSKDSFEMCPLCLD 399 

: I I I I I I : I I I I I I I I I III : : : I : I II III I I 

Db 296 GI YFAWLGYYTQMLLLAAWGVACFLYGYLNQDNCTWSKEVCHPDIGGK — I IMCPQC-D 352 

Qy 400 — CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDY 4 57 

Mill: I :: : I I I I : |::| I : I I I I : I I I : I I I I I : 
Db 353 RLCPFWKLNITCESSKKLCIFDSFGTLVFAVmGVWTLFLEFWKRRQAELEYEWDTVEL 412 

Qy 458 EDTEERPRPQFAASAPMTAPNPITGEDE — PYFPERSRARRMLAGSWIVVMVAVVVMCL 515 

: I I : I I : : I ■ I II 1 = 1 I : I I II : : : : : 

Db 413 QQ-EEQARPEYEARCTHWINEITQEEERI PFTAWGKCI RITLCASAVF- FWI LLI IASV 470

Qy 516 VSIILYRAIMAIVVSR SGNTLLAAWAS — RIASLTGSWNLVFILILSKIYVSL 567 

: I I : I I : I I I :| : : : |:| |::: : |:||: II : 

Db 471 IGIIVYRLSVFIVFSAKLPKNINGTDPIQKYLTPQTATSITASI ISFI IIMILNTIYEKV 530 

Qy 568 AH VLT RWEMH RTQTKFEDAFTLKVFIFQ FVN F YS S P VY I A FF KG R FVG Y PGN-YHTLFGV .626 

I :: I : 1 : I I I I : I : : I : I : I : I I I I I : I I I I I I I I I I : I I I I I I : : I 
Db 531 AIMITNFELPRTQTDYENSLTMKMFLFQFVNYYSSCFYIAFFKGKFVGYPGDPVYWLGKY 590 

Qy 627 RN E E C AAGGC L I E L AQE L LV I MVGKQV I N NMQEVL I P KL KGWWQ K F RL RS KKRKAG AS AG 686 

I I I I I lllhll : I : M I I : I I : I I U : I : : I I I 

Db 591 RNEECDPGGCLLELTTQLTI IMGGKAIWNNIQEVLLPWIMNLIGRFHRVSGSEKITPR' — 648 

Qy 687 ASQGPWEDDYELVPCE — GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDA 744 

II II I I III I I II I :: I I I I I I : I I I : I II I I I I : I I : I I I : I I 
Db 64 9 WEQDYHLQPMGKLGLFYEYLEMI I QFGFVTLFVAS FPLAPLLALVNN I LEI RVDA 703 

Qy 74 5 RKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWT 799 

I : : I I I I : I I I I I I I : I: I I I : : I I : : I I : I I : I I . I I : 
Db 704 WKLTTQFRRLVPEKAQDIGAWQPIMQGIAILAWTNAMIIAFTSDMIPRLWYWSFSVPP 763 

Qy 800 RAHDLRGFLNFTLARAPSSFAAA HNRTCRYRAFRDDDGH — 838 

:: : |::| I I III :: I I I I I I I II 

Db 764 YGDHTSYTMEGYINNTL S I FKVADFKNKS KGN P YS DLGN HTTCRYRD FRYP PGH PQ 819 
Qy 839 YSQT YWNLLAI RLAFVI VFEHWFSVGRLLDLLVPDI PESVE I KVKREYYLAKQALA 895

: : I I : : : I : I I I : I I I I I : : I I : : I I : : : I :: I I I I : : I 
Db 820 EYKHNIYYWHVIAAKLAFIIVMEHVIYSVKFFISYAIPDVSKRTKSKIQREKYLTQKLLH 879 

Qy 896 ENEV 8 99 

II : 

Db 880 ENHL 883 



RESULT 13 
ABB65993 

ID ABB65993 standard; protein; 1075 AA. 
XX 

AC ABB65993; 
XX 

DT 26-MAR-2002 (first entry) 
XX 



DE Drosophila melanogaster polypeptide SEQ ID NO 24771. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster. 



XX 

PN WO200171042-A2.
XX 

PD 27-SEP-2001. 
XX 

PF 23-MAR-2001; 2001WO-US009231.
XX

PR 23-MAR-2000; 2000US-0191637P.

PR 11-JUL-2000; 2000US-00614150.
XX 

PA (PEKE ) PE CORP NY. 
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL10096. 
XX 

PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signaling and cell-cell 

PT interactions. 
XX 

PS Disclosure; SEQ ID NO 24771; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of 

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737-

CC ABB72072). The sequence data for this patent did not form part of the

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1075 AA; 

Query Match 27.7%; Score 1369.5; DB 4; Length 1075; 
Best Local Similarity 37.4%; Pred. No. 1.9e-131; 

Matches 313; Conservative 163; Mismatches 283; Indels 77; Gaps 20; 

Qy 108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRT-WRET FLDNLRAAGLCVD — QQ DVQDGN 164 

I i I I I : : I : I : : I I I I I : I I I : hi 
Db 192 RSIDFVLAYRIN AHEPTELENTEKRRVFEANLISQGLEVESSQKD 236 

Qy 165 TTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVL LE 218 

: : : I II Ml |:|::|::|:| : : : : :| : 

Db 237 -QIWFVKIHAPLEVLRRYAEILKLRMPMKEIPGMSWNRSTKSVFSSLKHVFQFFLRNIY 295 

Qy 219 WPDVPPEYYSCRFRV -- NKLPRFLGSDNQDTFFTSTKRHQILFEIL -- AKTPYGHEKKN 274

I : : I : : i I :: : I Mill: I : I : II : I :: 

Db 296 VDEEIFPK-RAHRFTAIYSRDKEYLFDIRQDCFFTTAVRSRIVEFILDRQRFPAKNQHDM 354 
Qy 275 LLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDH 334

1 1 : 1 : 1 1 1 1 1 1 1 : 1 1 1 1 1 1 :| :: 1 1 1 1 1 : 1 1 1 1 1
Db 355 AFGIERLIAEGVYSAAYPLHDGEITETG TMRALLYKHWASVPKWYRYQPLDD 406

Qy 335 VRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGS-KDSFEM 393

:: 1 1 1 1 : 1 1 1 1 1 1 1 : 1 1 1 1 1 : : 1 1 : 1 1 1 : : : 1 : : : 1 1 : 1
Db 407 IKEYFGVKIGLYFAWLGYYTYMLLLASIVGVICFLYSWFSLKNYVPVKDICQSGNTNITM 466

Qy 394 CPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWD 453

Mi! 1 il 1 II: II: llll::||: 1 1 1 1 1 1 1 1 1 1 : : 1 1 1
Db 467 CPLCDWCNFWDLKETCNYAKVTYLIDNPSTVFFAVFMSFWATLFLELWKRYSAEITHRWD 526

Qy 454 CSDYEDTEERPRPQFAA SAPMTAPNPITGEDEPYFP-ERSRARRMLAGSWIWMVA 509

: :: II Mil: 1 1 1 : :| INI: : |:::::|
Db 527 LTGFDVHEEHPRPQYLARLEHIPPTRVDYVTNIKEPTVPFWRMKLPATVFSFSWLLLIA 586

Qy 510 VWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAH 569

: : 1 :: :: : I I 1 : : : : 1 :| : : : 1 1 1 : 1 1 : : 1 II
Db 587 I^FVALIAVVVYRMSMI^LKVGASPMTTSSAIVLATASAAFVNLCLLYILNYMYNHLAE 646

Qy 570 VLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNE 629

II II 1 1 1 1 : 1 : 1 : III::: 1 1 1 1 :.l : 1 1 1 11 II 1 : 1 1 1 :| 1 1 : II 1 1
Db 647 YLTELEMWRTQTQFDDSLTLKIYLLQFVNYYASIFYIAFFKGKFVGHPGEYNKLFDYRQE 706

Qy 630 ECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQ 689

1 1 :: 1 1 1 1 1 1 : 1 : 1 1 1 1 1 1 1 : 1 1 : 1 : 1 : 1 : 1 1 :
Db 707 ECSSGGCLTELCIQLAIIMVGKQAFNTILEVYLPM FWRKV LAIQVGLSRLFNN 759

Qy 690 GP WEDDYELVP -- CEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWV 738

1 1 1 : : 1 : 1 1 1 1 1 1 1 1 1 N : 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 :
Db 760 TPNPDKAKDERWMRDFKLLDWGARGLFPEYLEMVLQYGFVTIFVAAFPLAPFFALLNNIL 819

Qy 739 EIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW 798

1 : 1 1 1 1 : 1 : : : 1 1 1 : : 1 : 1 1 1 : 1 : 1 1 : 1 : 1 1 : 1 1 : : 1 1 : 1 1 : II II
Db 820 EMRLDAKKLLTHHKRPVSQRVRDIGVWYRILDCIGKLSVITNGFIIAFTSDMIPRLVYRH 879

Qy 799 -- TRAHDLRGFLNFTLAR APSSFAAAHN RTCRYRAFR -- DDDGHYSQT 842

: 1 |:lilll: :|: :: | : Mill) : 1
Db 880 YVNKQGTLDGYLNFTLSEFKVIDSPTLYSLAGDLSNITTCRYTDFRLPPSSPEKYTLSSM 939

Qy 843 YWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENE 898

:: :|| II 11:111: 1 1 1: :l|: : :::|| |: : : : 1
Db 940 FYIILACRLGFWVFENFVALVMILVRWCIPDMSVELRDQIRREVYVTNEIIIDQE 995





RESULT 14 
AEG11145 

ID AEG11145 standard; protein; 712 AA. 
XX 

AC AEG11145;
XX 

DT 20-APR-2006 (first entry) 
XX 

DE Human transmembrane protein 16A, SEQ ID NO: 10.
XX 



KW Genetic marker; diagnostic; prognosis; gastrointestinal tumor; 

KW cytostatic; neoplasm; tumor marker; transmembrane protein 16A. 
XX 

OS Homo sapiens. 



XX 

PN US20060040292-A1.
XX 

PD 23-FEB-2006. 
XX 

PF 08-JUL-2005; 2005US-00177894.
XX

PR 08-JUL-2004; 2004US-0586676P.
XX 

PA (WEST/) WEST R B. 

PA (VRIJ/) VAN DE RIJN M. 

XX 

PI West RB, Van De Rijn M; 
XX 

DR WPI; 2006-182760/19. 

DR N-PSDB; AEG11140. 

DR GENBANK; AAH27590.
XX

PT Classifying tumor as gastrointestinal stromal tumor belonging to PDGFRA 

PT positive subclass, involves detecting expression or activity of gene 

PT encoding DOG1 polypeptide in sample.
XX 

PS Disclosure; SEQ ID NO 10; 177pp; English. 
XX 

CC The present invention relates to three gene markers such as DOG1, KIT and

CC platelet derived-growth factor receptor alpha (PDGFRA) that are useful in

CC classifying tumors. These gene markers are useful in the classification

CC of gastrointestinal stromal tumors (GISTs) and tumors other than GISTs.

CC The invention also relates to methods providing diagnostic, prognostic

CC and predicative information based on the classifying step. The invention

CC is useful for classifying gastrointestinal stromal tumors as belonging to

CC a PDGFRA positive subclass, KIT negative or PDGFRA negative subclass. The

CC present sequence is human transmembrane protein 16A (DOG1; TMEM16A). The

CC DOG1 gene encodes a transmembrane protein of unknown function

CC (transmembrane protein 16A). The transmembrane protein 16A is encoded by

CC DOG1 gene that is mapped to 11q13.2 on chromosome 11.
XX 

SQ Sequence 712 AA; 

Query Match 27.6%; Score 1367.5; DB 10; Length 712; 

Best Local Similarity 41.6%; Pred. No. 1.6e-131; 

Matches 299; Conservative 128; Mismatches 220; Indels 71; Gaps 17; 



Qy 259 LFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQ 318 

: : I I I : I I : : I I I I I I I : I I : I I I I I : : I I : : I :: 

Db 2 VYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY NGENVEFNDRKLLYE 55 

Qy 319 HWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSD 378 

I I I : I : 1111:1 11:11111: 1 I I I I I I II I : I I : : I I : I I I I I : : 

Db 56 EWARYGVFYKYQP I DLVRKY FGEKIGLYFAWLGVYTQML I PAS I VGI I VFLYGCATMD EN 115 



Qy 379 IPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLL 437 

I I : I : I : : I I I I I I :| : I I I I I I: I III: I I I I I : I I I I I I 

Db 116 IPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVFMALWAATF 175 

Qy 438 LEYWKRKSATLAYRWDCSDYEDTEE RPRPQFAA SAPMTAPNPITGEDEPYF 4 88 

: I : I II I I I I I I : : I : I I I I : : I I : I I I : 

Db 176 MEHWKRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNKET — DKVKL 233 

Qy 489 PERS RARRMLAGS WI WMVAWVMCLVS 1 1 L YRAIMA I WS RSGNTLLAAWAS RI ASLT 548 

II I I I : I : I I :: :|:|| || :: : : : : : | 

Db 234 TWRDRFPAYLTNLVS 1 1 FMI AVTFAI VLGVI I YRI SMAAALAMNSS PSVRSN I RVT VTAT 293 

Qy 54 9 GSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAF 608 

: : I I I I : : I : : I : I I I : i : : I : II: I I : : I I I I : I : I I 

Db 294 AVIINLWIILLDEVYGCIARWLTKIEVPKTEKSFEERLIFKAFLLKFVNSYTPIFYVAF 353 

Qy 609 FKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI-NNMQEVLIPKLK 666 

111111111:1 : I 111111111:11 : I : I I : I I I : I I I : I : II I : I 

Db 354 FKGRFVGRPGDYVYI FRS FRMEECAPGGCLMELCIQLSI IMLGKQLIQNNLFEIGI PKMK 413 

Qy 667 GWWQ KF RL RS KKRKAGAS AGAS QG PWE D D Y EL VP CE GL FD E Y L EMV LQ FG FVT I FVAAC P 726 

: : ! : : : : I I I I I II I I : I I : : I I I I I I : I I I : I 

Db 414 KLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMI IQFGFVTLFVASFP 4 73 

Qy 727 LAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAF 786 

• I I I I I I I I I I : I II I I I : I I I I I I I I I I I : I I I I I :: I I I : I I I I I I I : : : I 

Db 474 LAPLFALLNN 1 1 EI RLDAKKFVTELRRPVAVRAKDI GI WYNI LRGIGKLAVI INAFVI SF 533 

Qy 787 SSDFLPRA— YYRWTRAHDLRGFLNFTLARAPSSF AAAHN RTCR 828 

: 1 I I : I I I : : : : I I : I I I ill II : I I 

Db 534 TSDFIPRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPNDPLDLGYEVQICR 589 

Qy 829 YRAFRD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI PESVEIKV 883 

I: :|: : I |: :| :|| llllllll:::| : :| ::||||: : :: 

Db 590 YKD Y RE P P WS EN KY D I S KD FWAVL AARL AFV I VFQN LVMFMS D FVD WV IPDIPKDISQEI 649 

Qy 884 KREYYLA KQALAENEVLFGTNGTKDEQP KGSELSSH 919 

: I I Mil. I I I I I I I I 

Db 650 HKEKVLMVELFMREEQDKQQLL — ETWMEKERQKDEPPCNHHNTKACPDSLGSPAPSH 705 



RESULT 15 
ABB65022

ID ABB65022 standard; protein; 1058 AA. 
XX 

AC ABB65022; 
XX 

DT 26-MAR-2002 (first entry) 
XX 



DE Drosophila melanogaster polypeptide SEQ ID NO 21858. 
XX 

KW Drosophila; developmental biology; cell signalling; insecticide; 

KW pharmaceutical. 

XX 

OS Drosophila melanogaster.
XX 

PN WO200171042-A2. 
XX 

PD 27-SEP-2001. 



XX 

PF 23-MAR-2001; 2001WO-US009231.
XX

PR 23-MAR-2000; 2000US-0191637P.

PR 11-JUL-2000; 2000US-00614150.
XX

PA (PEKE ) PE CORP NY.
XX 

PI Venter JC, Adams M, Li PWD, Myers EW; 
XX 

DR WPI; 2001-656860/75. 

DR N-PSDB; ABL09125. 
XX 



PT New isolated nucleic acid detection reagent for detecting 1000 or more 

PT genes from Drosophila and for elucidating cell signaling and cell-cell 

PT interactions. 
XX 

PS Disclosure; SEQ ID NO 21858; 21pp + Sequence Listing; English. 
XX 

CC The invention relates to an isolated nucleic acid detection reagent 

CC capable of detecting 1000 or more genes from Drosophila. The invention is 

CC useful in developmental biology and in elucidating cell signalling and 

CC cell-cell interactions in higher eukaryotes for the development of

CC insecticides, therapeutics and pharmaceutical drugs. The invention 

CC discloses genomic DNA sequences (ABL16176-ABL30511), expressed DNA

CC sequences (ABL01840-ABL16175) and the encoded proteins (ABB57737-

CC ABB72072). The sequence data for this patent did not form part of the

CC printed specification, but was obtained in electronic format directly 

CC from WIPO at ftp.wipo.int/pub/published_pct_sequences 

XX 

SQ Sequence 1058 AA; 



Query Match 24.2%; Score 1199.5; DB 4; Length 1058; 

Best Local Similarity 33.6%; Pred. No. 7.7e-114; 

Matches 294; Conservative 155; Mismatches 280; Indels 147; Gaps 26; 

Qy 108 R I AD FVL VWE ED LKL D RQQD S AARDRT DMH RT WRET FL DN L RAAGL CVDQQD VQ D GNT T V 167 

I MM: : : | : j| I I II: II :: || | 
Db 204 RSVDFVLAYNGETQLEE HRRKCE I FEANLQREGLQLEHN KVQ RV 247 

Qy 168 HYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEWPDVPPEY 227 

I: : I II Ml |::|:||:.:| I :: : |: 

Db 248 HFIKIHAPAEVLYRYAEILKIKVPLKPIPGQD QIFAESAHEF 289 

Qy 228 YSCRFRVNKL PRF LGSDNQDTFFTSTKRHQILFE 261 

:| I: I II I | :: |: |: 

Db 290 KTCFSRMCKSLFSSVQLNTALFPEREPRIHLEFSRNYLELYDTEHPNFLDASTRYSIINF 349 

Qy 262 ILAKTPY — GHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQ RQ 314 

I I : : II : M I : I : : M : I : M I : : : I I : 

Db 350 ILQRQRFVEGEETADNLGIEKLVQDGVYTCAYTLHDVERRSRSAAKGVGQHIQVEEQLRE 4 09 

Qy 315 VL FQ HWARWG KWN K YQ P L D H VRRY FG E KVA L Y FAWL G F YT GWL L P AAWGT L V F L VGC FL 374 

I : M I I :: II I M I II I II II I I I |: I :| I I II I 

Db 410 TLKPFYC SLQPLDQIKDYFGAKVALYFAWLGFYTQMLIPISVFGVLCFLYGFIT 4 63 

Qy 375 VFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALW 4 33 

II :: :: I : III I I : I I : I :: I I : II I : I I : I 
Db 4 64 WNSDPISRDICDDNGTI-MCPQCDRSCDYWRLNETCTSSKFNYLIDNNMTWFAFSMAIW 522 
Qy 434 AVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSR 493

1 1 : 1 1 : 1 i 1 1 1 1 : 1 1 : : 1 1 1 1 1 : 1 ' 1 : 1 : I : :
Db 523 AVVYLEFWKRYSAGLVHRWGLTGFTHHVEHPRPQYLARISRT--KKLAG--KAYEQDHTG 578

Qy 494 ARRMLAGSV VIWMVAVWMCLVSI ILYRAIMAIWSRSGNTLLA 538

1 : 1 1 :: | : : : | : : | | : | I : | : : :: |
Db 579 KRTILDPDVPFWSFKFLPNFTSYSIMVLFICISVIAIAGI IIYR MAQRASHSILG 633

Qy 539 AWAS RIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFI 593

: 1 1 :| :::|: 1 :| :| :|| II :| 1 1 I I : : : : : I : I : :
Db 634 SENSMTFKVMILPMTAGIIDLIVISLLDMVYSNLAVKLTNYEYCRTQTEYDESLTIKNYV 693

Qy 594 FQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQV 653

11111:111 1 1 1 1 11:11111 1 : : 1 I III 1111 = 11 : I : : I I III
Db 694 FQFVNYYSSLFYIAFLKGKFVGYPAKYNRVLGFRQEECNPGGCLMELCMQLVIIMAGKQA 753

Qy 654 INNMQEVLIPKL KGWW QKFRLRSKKRKAGASAGASQGPWEDDYELVP 700

: 1 : 1 : j 1 1 1 1 I : III : : 1 1 1 :|
Db 754 VNAIVEMLIPYLMRTFKELSYRHGWYKSHQDQRL VPYNQFTEDYNLLP 801

Qy 701 CE--GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAER 758

1 1 : II 1 1 1 1 : 1 1 1 1 : 1 : 1 1 1 1 1 1 1 1 1 1 M : 1 : 1 1 1 1 1 : MM 1
Db 802 AENNSLYVEYLEMWQFGFITLFSLAFPLAPLLALLNNVIEVRLDAIKMLRFLRRPVGMR 861

Qy 759 AQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAH-DLRGFLNFTLA 813

IMIhl 1: : 1 : 1 1 1 : 1 : : 1 II : : : 1 : 1 : :| MMMI
Db 862 ARDIGVWHSIMTWTRIAVASSAMIIAFSTNLIPKIVYAASMGDPELNNYLNFTLAVFNT 921

Qy 814 RAPSSFAAAH-NRT-CRYRAFRD--DDGH-YSQ--TYWNLLAIRLAFVIVFEHWF 862

Db 922 KDFQVQPLLGGSQHVNETVCRYTEFRNSPEDPHPYKRPMIYWKILTGRLAFIVIYQNIIT 981

Qy 863 SVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENE 898

: :| III: : : : I I I : I : : : I I
Db 982 MLQGILRWAVPDVSGRLLKRIKRENFLLREHIIEYE 1017





Search completed: October 27, 2006, 20:23:21 
Job time : 214 secs
SCORE Search Results Details for Application 10552515 and Search Result us-10-552-515-1.rai.




GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 



OM protein - protein search, using sw model

Run on:          October 27, 2006, 20:29:09 ; Search time 54 Seconds
                                              (without alignments)
                                              1512.335 Million cell updates/sec

Title:           US-10-552-515-1
Perfect score:   4950
Sequence:        1 MRMAATAWAGLQGPPLPTLC...SELSSHWTPFTVPKASQLQQ 933

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        650591 seqs, 87530628 residues

Total number of hits satisfying chosen parameters:   650591

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        Issued_Patents_AA:*
                 1: /EMC_Celerra_SIDS3/ptodata/2/iaa/5_COMB.pep:*
                 2: /EMC_Celerra_SIDS3/ptodata/2/iaa/6_COMB.pep:*
                 3: /EMC_Celerra_SIDS3/ptodata/2/iaa/7_COMB.pep:*
                 4: /EMC_Celerra_SIDS3/ptodata/2/iaa/H_COMB.pep:*
                 5: /EMC_Celerra_SIDS3/ptodata/2/iaa/PCTUS_COMB.pep:*
                 6: /EMC_Celerra_SIDS3/ptodata/2/iaa/RE_COMB.pep:*
                 7: /EMC_Celerra_SIDS3/ptodata/2/iaa/backfiles1.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
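
The throughput figure in the run header can be reproduced from the other header values. The sketch below is an illustration rather than anything emitted by GenCore: it assumes that one "cell update" is one query-residue by database-residue cell of the Smith-Waterman matrix, so the rate is query length times database residues divided by search time. The function name is hypothetical.

```python
def million_cell_updates_per_sec(query_length, db_residues, seconds):
    """Assumed definition: (query residues x database residues) / seconds, in millions."""
    return query_length * db_residues / seconds / 1e6


# Header values for this search: 933-residue query, 87530628 database residues, 54 seconds.
rate = million_cell_updates_per_sec(933, 87530628, 54)
print(f"{rate:.3f} Million cell updates/sec")
```

With these inputs the result agrees with the 1512.335 printed in the header, which suggests the assumed definition is the one the program uses.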



SUMMARIES 



                            %
Result                    Query
   No.    Score    Match  Length  DB  ID                     Description

     1   1531.5     30.9     920   2  US-10-104-047-2574     Sequence 2574, Ap
     2     1154     23.3     596   2  US-10-104-047-2541     Sequence 2541, Ap
     3    912.5     18.4     475   2  US-10-104-047-3116     Sequence 3116, Ap
     4      796     16.1     425   2  US-09-270-767-45552    Sequence 45552, A
     5    396.5      8.0     215   2  US-09-270-767-61064    Sequence 61064, A
     6      353      7.1     366   2  US-09-270-767-32253    Sequence 32253, A
     7      353      7.1     366   2  US-09-270-767-47470    Sequence 47470, A
     8      290      5.9     189   2  US-09-270-767-31816    Sequence 31816, A
     9      290      5.9     189   2  US-09-270-767-47033    Sequence 47033, A
    10    255.5      5.2     199   2  US-09-270-767-31722    Sequence 31722, A
    11    255.5      5.2     199   2  US-09-270-767-46939    Sequence 46939, A
    12    186.5      3.8     166   2  US-09-621-976-4064     Sequence 4064, Ap
    13      117      2.4     548   1  US-08-676-279-50       Sequence 50, Appl
    14      117      2.4     548   2  US-08-903-139B-8       Sequence 8, Appli
    15      117      2.4     548   2  US-08-637-823B-25      Sequence 25, Appl
    16      117      2.4     548   2  US-09-614-957D-25      Sequence 25, Appl
    17    115.5      2.3    2013   1  US-08-324-977-12       Sequence 12, Appl
    18    115.5      2.3    2013   1  US-08-384-616-12       Sequence 12, Appl
    19    115.5      2.3    2013   1  US-08-904-686A-12      Sequence 12, Appl
    20    115.5      2.3    2013   2  US-09-315-850-12       Sequence 12, Appl
    21    115.5      2.3    3010   1  US-08-324-977-2        Sequence 2, Appli
    22    115.5      2.3    3010   1  US-08-324-977-14       Sequence 14, Appl
    23    115.5      2.3    3010   1  US-08-384-616-2        Sequence 2, Appli
    24    115.5      2.3    3010   1  US-08-384-616-14       Sequence 14, Appl
    25    115.5      2.3    3010   1  US-08-904-686A-2       Sequence 2, Appli
    26    115.5      2.3    3010   1  US-08-904-686A-14      Sequence 14, Appl
    27    115.5      2.3    3010   2  US-09-315-850-2        Sequence 2, Appli
    28    115.5      2.3    3010   2  US-09-315-850-14       Sequence 14, Appl
    29    110.5      2.2     680   2  US-09-725-735A-19      Sequence 19, Appl
    30    108.5      2.2     523   2  US-09-949-016-11540    Sequence 11540, A
    31    108.5      2.2     578   2  US-09-052-753B-7       Sequence 7, Appli
    32      106      2.1     539   2  US-09-248-796A-16542   Sequence 16542, A
    33      105      2.1    1089   2  US-10-012-231A-102     Sequence 102, App
    34      105      2.1    1089   2  US-10-015-389A-102     Sequence 102, App
    35      105      2.1    1089   2  US-10-006-768A-102     Sequence 102, App
    36      105      2.1    1089   2  US-10-015-671A-102     Sequence 102, App
    37      105      2.1    1089   2  US-10-015-393A-102     Sequence 102, App
    38      105      2.1    1089   2  US-10-011-833A-102     Sequence 102, App
    39      105      2.1    1089   2  US-10-006-041A-102     Sequence 102, App
    40      105      2.1    1089   2  US-10-012-064A-102     Sequence 102, App
    41      105      2.1    1089   2  US-10-015-392A-102     Sequence 102, App
    42      105      2.1    1089   3  US-10-011-795B-102     Sequence 102, App
    43      105      2.1    1089   3  US-10-015-386A-102     Sequence 102, App
    44      105      2.1    1089   3  US-10-012-121A-102     Sequence 102, App
    45      105      2.1    1089   3  US-10-006-485A-102     Sequence 102, App



ALIGNMENTS 



RESULT 1 

US-10-104-047-2574 

Sequence 2574, Application US/10104047 
Patent No. 6943241 
GENERAL INFORMATION:
APPLICANT: HELIX RESEARCH INSTITUTE
TITLE OF INVENTION: No. 6943241el full length cDNA
FILE REFERENCE: H1-A0105

CURRENT APPLICATION NUMBER: US/10/104,047
CURRENT FILING DATE: 2002-03-25
PRIOR APPLICATION NUMBER:
PRIOR FILING DATE:
NUMBER OF SEQ ID NOS: 4096
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2574 
LENGTH: 920 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-104-047-2574 

Query Match 30.9%; Score 1531.5; DB 2; Length 920; 

Best Local Similarity 37.9%; Pred. No. 2e-157; 

Matches 360; Conservative 168; Mismatches 316; Indels 105; Gaps 29; 
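
The two percentages on the statistics lines above can be reproduced from the reported counts. The following is a minimal Python sketch, not GenCore's own code. It assumes (inferred from the figures in this report, not from vendor documentation) that Best Local Similarity is Matches divided by all aligned columns (Matches + Conservative + Mismatches + Indels), and that Query Match is Score divided by the query's perfect score, taken here as 4950 from the header of the earlier run for this query. The function name is hypothetical.

```python
def alignment_percentages(matches, conservative, mismatches, indels,
                          score, perfect_score=4950.0):
    """Return (best_local_similarity, query_match), both in percent.

    Assumption: similarity = identities / aligned columns, and
    query match = hit score / perfect (self-match) score of the query.
    """
    columns = matches + conservative + mismatches + indels
    similarity = 100.0 * matches / columns
    query_match = 100.0 * score / perfect_score
    return round(similarity, 1), round(query_match, 1)


# RESULT 1: Matches 360; Conservative 168; Mismatches 316; Indels 105; Score 1531.5
print(alignment_percentages(360, 168, 316, 105, 1531.5))
# RESULT 2: Matches 250; Conservative 108; Mismatches 194; Indels 54; Score 1154
print(alignment_percentages(250, 108, 194, 54, 1154))
```

Under these assumptions the sketch returns 37.9 and 30.9 for RESULT 1 and 41.3 and 23.3 for RESULT 2, which agree with the values printed in the report.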

Qy 44 TSSGSHCARSRMLRRRAQEEDSTVLID VSPPEAE KRGSYGST AHASEP 91

: I I I : : : : I : I : I I III : I : II I ' I

Db 4 SSSGITNGKTKVFHPVA--KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61

Qy 92 GGQQAAAC RAGS PAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET FLD 146

II: : | | | | :: I I : : : : I : II I

Db 62 GGETVPERNKSNGLYFRDGKCRI-DYILVYRK SNPQTEK REVFER 105

Qy 147 NLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQE LPNQASNW 202

I : I I I ! :::: I : : : I I I I I I I I : ::: I : I I :

Db 106 NIRAEGLQMEKESSLI-NSDIIFVKLHAPWEVLGRYAEQMNVRMPFRRKIYYLPRRYKFM 164

Qy 203 S AGLLAWLGIPNVLL -- EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTST 253

1 : 1 II : 1 1 :M: : |: 1 :: I: |::||| :

Db 165 SRIDKQISRLRRWLPKKPMRLDKETLPDLEENDCYTAPFSQQRIHHFI-IHNKETFFNNA 223

Qy 254 KRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQR 313

1:1: II: 1 1 II : 1 : : : M 1 1 1 1 1 1 1 : 1 :: : I I

Db 224 TRSRIVHHILQRIKY-EEGKNKIGLNRLLTNGSYEAAFPLHEGSYRSKNSIRTHGAENHR 282

Qy 314 QVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCF 373

: 1 : : 11 II 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 1 1 1 III : 1 III 1

Db 283 HLLYECWASWGVWYKYQPLDLVRRYFGEKIGLYFAWLGWYTGMLFPAAFIGLFVFLYGVT 342

Qy 374 LVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMAL 432

: ::|:| : I |||:| ||| || :| |: 1 1 1 : 1 1 1 I 1 : : 1 1 I :

Db 343 TLDHSQVSKEVCQATDII-MCPVCDKYCPFMRLSDSCVYAKVTHLFDNGATVFFAVFMAV 401

Qy 433 WAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPER 491

1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 : 1 : 1 1 1 1 1 1 1 : 1 1 1 : 1 : 1 1 1

Db 402 WATVFLEFWKRRRAVIAYDWDLIDWEEEEEEIRPQFEAKYSKKERMNPISGKPEPYQAFT 461

Qy 492 SRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLA-AWA SRIA 545

: 1 : : : 1 1 : 1 1 : : 1 : : 1 1 : : 1 1 II 1 : : 1

Db 462 DKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV STFAAFKWALIRNNSQVA 514

Qy 546 SLTGSW--NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSP 603

: 1 1 : 1 1 1 :: 1 : : 1 : 1 : 1 1 1 1 1 : : : : 1 : : II 1 1 : 1 : 1 1 1 1 1 II

Db 515 T-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNSST 573

Qy 604 VYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLI 662

111111111:111 1 1 1 1 1 1 1 1 1 : 1 :: : 1 1 1 1 1 II 1 :

Db 574 FYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGIIMVLKQTWNNFMELGY 633

Qy 663 PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFVTI 720

1 :: II : 1 •: : : 1 " 1 II II 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1

Db 634 PLIQNWWTR RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFTTI 690

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

Mil Mill Mill :llllll III ::ll|:| Ihlllll: II |: |:||:|

Db 691 FVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITN 750

Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816

II :": 1 :| 11:11 1 : : |::| :|:

Db 751 AFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDG 810

Qy 817 SSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLL 871

II: : 1 1 1 1 : 1 1 1 : : 1 : : 1 1 1 1 1 1 : 1 1 1 1 1 : 1 1 : I : 1

Db 811 SEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLISYL 870

Qy 872 VPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920

:||:|: : :::|| II :: : 1 I: |: : 1 : I

Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919




RESULT 2 
US-10-104-047-2541

; Sequence 2541, Application US/10104047
; Patent No. 6943241
; GENERAL INFORMATION:
; APPLICANT: HELIX RESEARCH INSTITUTE
; TITLE OF INVENTION: No. 6943241el full length cDNA
; FILE REFERENCE: H1-A0105
; CURRENT APPLICATION NUMBER: US/10/104,047
; CURRENT FILING DATE: 2002-03-25
; PRIOR APPLICATION NUMBER:
; PRIOR FILING DATE:
; NUMBER OF SEQ ID NOS: 4096
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 2541
; LENGTH: 596
; TYPE: PRT
; ORGANISM: Homo sapiens
US-10-104-047-2541

Query Match 23.3%; Score 1154; DB 2; Length 596;

Best Local Similarity 41.3%; Pred. No. 1.9e-116; 

Matches 250; Conservative 108; Mismatches 194; Indels 54; Gaps 14;

Qy 357 LLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAG 415

I I I I : I I I I I : : : I : I : | I I I : I III I I : I | : 

Db 2 LFPAAFIGLFVFLYGVTTLDHSQVSKEVCQATDII-MCPVCDKYCPFMRLSDSCVYAKVT 60 

Qy 416 RLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APM 474 

I I I : I I I I I : : I I I : I I : I I : I I I : I : I I I I |: I : I I I I I I I : 
Db 61 HLFDNGATVFFAVFMAVWATVFLEFWKRRRAVIAYDWDLIDWEEEEEEIRPQFEAKYSKK 120 

Qy 475 TAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVSI ILYRAIMAIWSRSGN 534 

III : I:: : I |: II: : |::|| : : 
Db 121 ERMNPISGKPEPYQAFTDKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV S 173

Qy 535 TLLA-AWA SRIASLTGSW — NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDA 586 

I I II II: I I |::|: :| :| :|| I ||::::|:: 

Db 174 TFAAFKWALIRNNSQVAT-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENS 232 

Qy 587 FTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELL 645 

I I I I : I : I I I I I II I I I I I I I I I : I I I I till I I I I : I : : 
Db 233 FTLKMFLFQFVNLNSSTFYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMG 292

Qy 646 VIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE--G 703

: I I I I I II I : I : : I I : I :: : I I I I I I I I I 

Db 293 IIMVLKQTWNNFMELGYPLIQNWWTR RKVRQEHGPERKISFPQWEKDYNLQPMNAYG 349

Qy 704 LFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIG 763 

I I I I I I I I : I I I I I I I I I I I I I I I I I I I I I : I I I I I I III : : I I I : I I I : I I I 
Db 350 LFDEYLEMILQFGFTTIFVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIG 409

Qy 764 IWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA 813 

I I : I I I : I : I I : I I I : : I : I I I : I I I : : |:: I : |: 

Db 410 IWYGILEGIGILSVITNAFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLS 469

Qy 814 RAPSSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFV 854 

II: : I I I I : I I I : : I :: I I I I I I : 

Db 470 VFRISDFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFI 529 

Qy 855 IVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGS 914

11111:11: I : I : I I : I : : : : : I I I I : : : I I : I: : I 

Db 530 IVFEHLVFCIKHLISYLIPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGK 589 

Qy 915 ELSSHW 920 

: I 

Db 590 AHHNEW 595 



RESULT 3 

US-10-104-047-3116 

; Sequence 3116, Application US/10104047
; Patent No. 6943241 
; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: No. 6943241el full length cDNA 

; FILE REFERENCE: H1-A0105 

; CURRENT APPLICATION NUMBER: US/10/104,047 

; CURRENT FILING DATE: 2002-03-25 

; PRIOR APPLICATION NUMBER: 

; PRIOR FILING DATE: 

; NUMBER OF SEQ ID NOS: 4096 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 3116

LENGTH: 475

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-104-047-3116 

Query Match 18.4%; Score 912.5; DB 2; Length 475; 

Best Local Similarity 38.0%; Pred. No. 3.4e-90; 

Matches 202; Conservative 89; Mismatches 143; Indels 97; Gaps 12; 
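
Each hit carries three semicolon-delimited statistics lines like the ones above, which makes them straightforward to pull into a table. The following Python sketch is one possible way to parse such a line; the function name `parse_stats` and the regular expression are illustrative assumptions, not part of GenCore or any USPTO tool, and the pattern is only tuned to the field shapes visible in this report.

```python
import re

# A field is "<label> <number>[%];", e.g. "Query Match 18.4%;" or "Pred. No. 3.4e-90;".
# Labels may contain letters, spaces and periods; numbers may use exponent notation.
STAT_FIELD = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s+([0-9][0-9.eE+-]*)%?\s*;")


def parse_stats(line):
    """Return a dict mapping each field label on the line to a float value."""
    return {label.strip(): float(value)
            for label, value in STAT_FIELD.findall(line)}


print(parse_stats("Query Match 18.4%; Score 912.5; DB 2; Length 475;"))
print(parse_stats("Best Local Similarity 38.0%; Pred. No. 3.4e-90;"))
print(parse_stats("Matches 202; Conservative 89; Mismatches 143; Indels 97; Gaps 12;"))
```

Lines damaged by extraction (for example a digit split by a stray space, as in "Matches 4 5;") would need to be repaired before parsing, since the pattern expects each number to be contiguous.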

Qy 430 MALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFP 489

I I : M : I I : I I I : I : I I I I I : I : I
Db 1 MAVWATVFLEFWKRRRAVIAYDWDLIDWEEEE 32

Qy 490 ERSRARRMLAGSWIWMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLA-AWA SR 543

: I I : : I : : I I : : I I II I :

Db 33 ICWIAAVFGIVIYRWTV STFAAFKWALIRNNSQ 67

Qy 544 IASLTGSW--NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYS 601
: I : II: I I I : : I : : I : I : I I I I I ::::(:: I I I I : I : I M I I I
Db 68 VAT-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENSFTLKMFLFQFVNLNS 126

Qy 602 SPVYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660

I" I I I I I I I I :! I I I III! I I I I : I : : : I I I I I I I I :
Db 127 STFYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGIIMVLKQTWNNFMEL 186

Qy 661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE--GLFDEYLEMVLQFGFV 718

I :: I I : I ::: I I II II I I I I I I I I I I I : I I I I I

Db 187 GYPLIQNWWTR RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFT 243

Qy 719 TIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVI 778

I I I I I I I I I I I I I I I I : I I I I I I . I I I : : I I I : I I I : I I I I h II I : CM
Db 244 TIFVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVI 303

Qy 779 SNAFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA R 814

: I I I : : I : I I I : I I I : : | :: | : | :

Db 304 TNAFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPES 363

Qy 815 APSSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLD 869

II: : I I I I : I I . I : : I : : I I I I I I : I I I I I : I I : I :

Db 364 DGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLIS 423

Qy 870 LLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920

I : I I : I : : : : : I I I I : : : I I : I : : I : I

Db 424 YLIPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 474



RESULT 4 

US-09-270-767-45552 

; Sequence 45552, Application US/09270767

; Patent No. 6703491

; GENERAL INFORMATION:

; APPLICANT: Homburger et al.

; TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster

; FILE REFERENCE: File Reference: 7326-094

; CURRENT APPLICATION NUMBER: US/09/270,767

; CURRENT FILING DATE: 1999-03-17

; NUMBER OF SEQ ID NOS: 62517

; SOFTWARE: PatentIn Ver. 2.0

; SEQ ID NO 45552 

LENGTH: 425 

TYPE: PRT 

ORGANISM: Drosophila melanogaster 
FEATURE: 

OTHER INFORMATION: Xaa means any amino acid 
US-09-270-767-45552 

Query Match 16.1%; Score 796; DB 2; Length 425; 

Best Local Similarity 41.1%; Pred. No. 1.6e-77;

Matches 171; Conservative 85; Mismatches 128; Indels 32; Gaps 9; 



Qy 443 RKSATLAYRWDCSDYEDTEERPRPQFAA---SAPMTAPNPITGEDEPYFP-ERSRARRML 498

I I I : : I I I : : : I I I I I I : I I I : : I I I i I : :

Db 1 RYSAEITHRWDLTGFDVHEEHPRPQYLARLEHIPPTRVDYVTNIKEPTVPFWRMKLPATV 60

Qy 499 AGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASLTGSVVNLVFIL 558

|:::::|: : |:::::|| I : : : : I :|: : : III :

Db 61 FSFSWLLLIALAFVALLAVWYRMSMLAALKVGASPMTTSSAIVLATASAAFVNLCLLY 120

Qy 559 ILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPG 618

II: : I II I I II I I I I : I : I : IN::: I I I I : I : I II I I I I I : I I I : I I

Db 121 ILNYMYNHLAEYLTELEMWRTQTQFDDSLTLKIYLLQFVNYYASIFYIAFFKGKFVGHPG 180

Qy 619 NYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKK 678

1:11 I I I I :: I I I I I I : I : I I I I I I I : I I : I : I : I

Db 181 EYNKLFDYRQEECSSGGCLTELCIQLAIIMVGKQAFNTILEVYLPM FWRKV LA 233

Qy 679 RKAGASAGASQGP---------WEDDYELVP--CEGLFDEYLEMVLQFGFVTIFVAACPL 727

: I I : I I I : : I : I I I I I I I I I I I : I I I M I I I I I I

Db 234 IQVGLSRLFNNTPNPDKTKDERWMRDFKLLDWGARGLFPEYLEMVLQYGFVTIFVAAFPL 293

Qy 728 APLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFS 787

II MINI :|:||||:| : ::|||::| II :

Db 294 APFFALLNNILEMRLDAKKLLTHHKRPVSQRVRDIGVWYRILDCIGKLSVITNGFIIAFT 353

Qy 788 SDFLPR-AYYRWTRAHDLRGFLNFTLAR APSSFAAAHN RTCRYRAFR 833

II :|| : : I I : I I I I I : :|: :: I : Nil II

Db 354 SDMIPRLVRHXVNKQGTLDGYLNFTLSEFKVIDSPTLYSLAGDLSNITTCRYTDFR 409

RESULT 5 

US-09-270-767-61064 

Sequence 61064, Application US/09270767
Patent No. 6703491
GENERAL INFORMATION:
APPLICANT: Homburger et al.

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster
FILE REFERENCE: File Reference: 7326-094
CURRENT APPLICATION NUMBER: US/09/270,767
CURRENT FILING DATE: 1999-03-17
NUMBER OF SEQ ID NOS: 62517
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 61064 
LENGTH: 215 
TYPE: PRT 

ORGANISM: Drosophila melanogaster 
US-09-270-767-61064 

Query Match 8.0%; Score 396.5; DB 2; Length 215; 

Best Local Similarity 43.2%; Pred. No. 2.4e-34;

Matches 89; Conservative 36; Mismatches 54; Indels 27; Gaps 6; 

Qy 648 MVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGP WEDDYEL 698

111111:11:1 : I : I : I I : I I I : : I

Db 1 MVGKQAFNTILEVYLPM FWRKV LAIQVGLSRLFNNTPNPDKTKDERWMRDFKL 53

Qy 699 VP--CEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVA 756

: I I I I I I I I I I I : I I I I I I I I I I I I I I I I I I I : I : I I I I : I : : : I I I :

Db 54 LDWGARGLFPEYLEMVLQYGFVTIFVAAFPLAPFFALLNNILEMRLDAKKLLTHHKRPVS 113

Qy 757 ERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLAR-- 814

: I : I I I: I : I I : I : I I : I I : : I I : I I : I I : I I I I I :

Db 114

Qy 815

I I I I II

Db 174



RESULT 6 

US-09-270-767-32253 

; Sequence 32253, Application US/09270767 
; Patent No. 6703491 
; GENERAL INFORMATION: 

APPLICANT: Homburger et al. 

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 
; FILE REFERENCE: File Reference: 7326-094 
; CURRENT APPLICATION NUMBER: US/09/270,767 
; CURRENT FILING DATE: 1999-03-17 
; NUMBER OF SEQ ID NOS: 62517 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 32253 
LENGTH: 366 
TYPE: PRT 

; ORGANISM: Drosophila melanogaster 
US-09-270-767-32253 

Query Match 7.1%; Score 353; DB 2; Length 366; 

Best Local Similarity 32.4%; Pred. No. 3.6e-29; 

Matches 95; Conservative 56; Mismatches 104; Indels 38; Gaps 9; 

Qy 108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRT-WRETFLDNLRAAGLCVD--QQDVQDGN 164

I Mil: : I : I : : I I I I I : I I I : I : I
Db 93 RSIDFVLAYRIN---------------AHEPTELENTEKRRVFEANLISQGLEVESSQKD 137

Qy 165 TTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVL------LE 218

: : : I II I I I I : I : : I : : I : I : : : : : I :
Db 138 -QIWFVKIHAPLEVLRRYAEILKLRMPMKEIPGMSWNRSTKSVFSSLKHVFQFFLRNIY 196

Qy 219 WPDVPPEYYSCRFRV--NKLPRFLGSDNQDTFFTSTKRHQILFEIL--AKTPYGHEKKN 274

I : : I : : I I : : : I Mill: I : I : II : I : :
Db 197 VDEEIFPK-RAHRFTAIYSRDKEYLFDIRQDCFFTTAVRSRIVEFILDRQRFPAKNQHDM 255

Qy 275 LLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDH 334

I I : I : I I I I I I I : I I I I I I : I :: I I I I I : I I I I I

Db 256 AFGIERLIAEGVYSAAYPLHDGEITETG--------TMRALLYKHWASVPKWYRYQPLDD 307

Qy 335 VRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGS 387

:: I I I I : I I I I I I I : I I I I I : : I I : I I I : : : 1 : : : I I
Db 308 IKEYFGVKIGLYFAWLGYYTYMLLLASIVGVICFLYSWFSLKNYVPVKDICQS 360



RESULT 7 

US-09-270-767-47470 

; Sequence 47470, Application US/09270767 

; Patent No. 6703491

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al. 

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 
; FILE REFERENCE: File Reference: 7326-094 
; CURRENT APPLICATION NUMBER: US/09/270,767 
; CURRENT FILING DATE: 1999-03-17 
; NUMBER OF SEQ ID NOS: 62517 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 47470 

LENGTH: 366 

TYPE: PRT 

; ORGANISM: Drosophila melanogaster 
US-09-270-767-47470 

Query Match 7.1%; Score 353; DB 2; Length 366; 

Best Local Similarity 32.4%; Pred. No. 3.6e-29; 

Matches 95; Conservative 56; Mismatches 104; Indels 38; Gaps 9; 

Qy 108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRT-WRETFLDNLRAAGLCVD--QQDVQDGN 164

I MM: : I : I:: I I I M : M I: hi
Db 93 RSIDFVLAYRIN---------------AHEPTELENTEKRRVFEANLISQGLEVESSQKD 137

Qy 165 TTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVL------LE 218

: : : I II I I I I : I : : I : : I : I : : : : : I :
Db 138 -QIWFVKIHAPLEVLRRYAEILKLRMPMKEIPGMSWNRSTKSVFSSLKHVFQFFLRNIY 196

Qy 219 WPDVPPEYYSCRFRV--NKLPRFLGSDNQDTFFTSTKRHQILFEIL--AKTPYGHEKKN 274

* I : : I : : I I :: : I Mill: I : I : II : I : :
Db 197 VDEEIFPK-RAHRFTAIYSRDKEYLFDIRQDCFFTTAVRSRIVEFILDRQRFPAKNQHDM 255

Qy 275 LLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDH 334

I I : I : I I I I I I I : I I I 1 I I : I :: II I II : I I I I I

Db 256 AFGIERLIAEGVYSAAYPLHDGEITETG--------TMRALLYKHWASVPKWYRYQPLDD 307

Qy 335 VRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGS 387

:: I I I I : I I I I I II : I I II I :: M : M I : : : I : : : I I
Db 308 IKEYFGVKIGLYFAWLGYYTYMLLLASIVGVICFLYSWFSLKNYVPVKDICQS 360



RESULT 8 

US-09-270-767-31816

; Sequence 31816, Application US/09270767 

; Patent No. 6703491 

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al. 

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 
; FILE REFERENCE: File Reference: 7326-094 
; CURRENT APPLICATION NUMBER: US/09/270,767
; CURRENT FILING DATE: 1999-03-17 
; NUMBER OF SEQ ID NOS: 62517 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 31816 

LENGTH: 189 

TYPE: PRT 

; ORGANISM: Drosophila melanogaster 
US-09-270-767-31816 

Query Match 5.9%; Score 290; DB 2; Length 189;

Best Local Similarity 37.8%; Pred. No. 8.8e-23;

Matches 62; Conservative 25; Mismatches 59; Indels 18; Gaps 4; 



Qy 249 FFTSTKRHQILFEILAKTPY--GHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQ 306

I : : I : I : I I : : II : I I I : I : : I I : I : I I I
Db 39 FLDASTRYSIINFILQRQRFVEGEETADNLGIEKLVQDGVYTCAYTLHD----------- 87

Qy 307 APRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTL 366

: I I : I I II I I I I :: I I I I I I I I I I I I I I I I I : I :| I I
Db 88 ---KDDRDRLLKEWANISKWKNLQPLDQIKDYFGAKVALYFAWLGFYTQMLIPISVFGVL 144

Qy 367 VFLVGCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSAC 409

III I I : : : : | : I I I I I : I I : I

Db 145 CFLYGFITWNSDPISRDICDDNGTI-MCPQCDRSCDYWRLNETC 187



RESULT 9 

US-09-270-767-47033 

; Sequence 47033, Application US/09270767

; Patent No. 6703491 

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al. 

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 
; FILE REFERENCE: File Reference: 7326-094 
; CURRENT APPLICATION NUMBER: US/09/270,767 
; CURRENT FILING DATE: 1999-03-17 
; NUMBER OF SEQ ID NOS: 62517 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 47033
LENGTH: 189

TYPE: PRT

ORGANISM: Drosophila melanogaster 
US-09-270-767-47033 

Query Match 5.9%; Score 290; DB 2; Length 189; 

Best Local Similarity 37.8%; Pred. No. 8.8e-23; 

Matches 62; Conservative 25; Mismatches 59; Indels 18; Gaps 4; 

Qy 249 FFTSTKRHQILFEILAKTPY--GHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQ 306

I : : I : I : I I : : II : I I I : I : : I I : I : I I I
Db 39 FLDASTRYSIINFILQRQRFVEGEETADNLGIEKLVQDGVYTCAYTLHD----------- 87

Qy 307 APRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTL 366

: I I : I I II t I I I :: I 1,1 I I I I I I I I I I I I I I : I :| i I
Db 88 ---KDDRDRLLKEWANISKWKNLQPLDQIKDYFGAKVALYFAWLGFYTQMLIPISVFGVL 144

Qy 367 VFLVGCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSAC 409

III II :: :: ! : I I I I I : I I : I

Db 145 CFLYGFITWNSDPISRDICDDNGTI-MCPQCDRSCDYWRLNETC 187



RESULT 10 

US-09-270-767-31722 

; Sequence 31722, Application US/09270767 

; Patent No. 6703491 

; GENERAL INFORMATION: 

; APPLICANT: Homburger et al. 

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster 
; FILE REFERENCE: File Reference: 7326-094 
; CURRENT APPLICATION NUMBER: US/09/270,767 
; CURRENT FILING DATE: 1999-03-17 
; NUMBER OF SEQ ID NOS: 62517 
; SOFTWARE: PatentIn Ver. 2.0
; SEQ ID NO 31722
LENGTH: 199
TYPE: PRT
; ORGANISM: Drosophila melanogaster
US-09-270-767-31722

Query Match 5.2%; Score 255.5; DB 2; Length 199; 

Best Local Similarity 29.8%; Pred. No. 5.7e-19; 

Matches 59; Conservative 46; Mismatches 64; Indels 29; Gaps 5; 

Qy 417 LFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTA 476

II: III: 11:111: 11:111 II I :M : : I MM: I I 
Db 11 LIDNNMTWFAFSMAIWAWYLEFWKRYSAGLVHRWGLTGFTHHVEHPRPQYLARISRT- 69

Qy 477 PNPITGEDEPYFPERSRARRMLAGSV VI WMVAVWMCLVSI ILY 521

: I : I : : : I : I I : : I : : : I : : I I : I 

Db 70 -KKLAG — KAYEQDQTGKRTILDPDVPFWS FKFLPNFTSYSIMVLFICISVIAIAGI I IY 126 

Qy 522 RAIMAIWSRSGNTLLAAWAS RI AS LTGS WNLVF I L I LSK I YVS LAHVLT RWEM 576 

I : I I :| :::|: I :| :| :|| II :| 

Db 127 R MAQRASHSILGSENSMTFKVMILPMTAGIIDLIVISLLDMVYSNLAVKLTNYEY 181 

Qy 577 HRTQTKFEDAFTLKVFIF 594 

Mil::::: | : | : : | 
Db 182 CRTQTEYDESLTIKNYVF 199 



RESULT 11 

US-09-270-767-46939 

Sequence 46939, Application US/09270767 
Patent No. 6703491 
GENERAL INFORMATION:
APPLICANT: Homburger et al.

TITLE OF INVENTION: Nucleic acids and proteins of Drosophila melanogaster
FILE REFERENCE: File Reference: 7326-094
CURRENT APPLICATION NUMBER: US/09/270,767
CURRENT FILING DATE: 1999-03-17
NUMBER OF SEQ ID NOS: 62517
SOFTWARE: PatentIn Ver. 2.0
SEQ ID NO 46939 
LENGTH: 199 
TYPE: PRT 

ORGANISM: Drosophila melanogaster 
US-09-270-767-46939 

Query Match 5.2%; Score 255.5; DB 2; Length 199; 

Best Local Similarity 29.8%; Pred. No. 5.7e-19; 

Matches 59; Conservative 46; Mismatches 64; Indels 29; Gaps 5; 

Qy 417 LFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTA 476

II: III: I I : I I I : I I : I I I I I I : I I : : I Mil: I I

Db 11

Qy 477

I : I I ::|: : : |: : 1|:|

Db 70

Qy 522

: I : :: : I : I I : I :: : I : I : I :| : I I I I : I

Db 127 -MAQRASHSILGSENSMTFKVMILPMTAGIIDLIVISLLDMVYSNLAVKLTNYEY 181

Qy 577

||||::::: | : | : : |

Db 182

RESULT 12 

US-09-621-976-4064 

Sequence 4064, Application US/09621976 
Patent No. 6639063 
GENERAL INFORMATION: 
APPLICANT: Dumas Milne Edwards, J.B. 
APPLICANT: Jobert, S. 
APPLICANT: Giordano, J.Y. 

TITLE OF INVENTION: ESTs and Encoded Human Proteins. 
FILE REFERENCE: GENSET.054PR2
CURRENT APPLICATION NUMBER: US/09/621,976 
CURRENT FILING DATE: 2000-07-21 
NUMBER OF SEQ ID NOS: 19335 
SOFTWARE: Patent.pm
SEQ ID NO 4064 
LENGTH: 166 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: SIGNAL 
LOCATION: -58.. -1 
US-09-621-976-4064

Query Match 3.8%; Score 186.5; DB 2; Length 166;

Best Local Similarity 28.0%; Pred. No. 1.5e-11;

Matches 45; Conservative 34; Mismatches 75; Indels 7; Gaps 3;

Qy 448 LAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWM 507 

: III : I I I I I : I I I I : : I I : I I I : I : I : 

Db 1 MTYRWGTLLMKRKFEEPRPGFHG---VLGINSITGKEEPLYPSYKRQLRIYLVSLPFVCL 57

Qy 508 VAVWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSL 567 

: :: I : |: : : : I I : |:: : I |::::| 

Db 58 CLYFSLYVMMIYFDMEVWALGLHENSG SEWTS-VLLYVPSI IYAIVIEIMNRLYRYA 113 

Qy 568 AHVLTRWEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVYIAF 608 

I I I I I I I :: ::: I I I : I |:| :: I I I I I 
Db 114 AEFLTSWENHRLESAYQNHLILKVLVFNFLNCFASLFYIAF 154 



RESULT 13
US-08-676-279-50 

; Sequence 50, Application US/08676279 
; Patent No. 5869247 
; GENERAL INFORMATION:
APPLICANT: 

TITLE OF INVENTION: MACROPHAGE NUCLEOTIDE SEQUENCE 
NUMBER OF SEQUENCES: 63 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/676,279 

FILING DATE: 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: PCT/GB95/00095 

APPLICATION NUMBER: GB 9400929.7 

FILING DATE: 19-JAN-1994 
PRIOR APPLICATION DATA: 

APPLICATION NUMBER: GB 9422021.7 

FILING DATE: 31-OCT-1994
; INFORMATION FOR SEQ ID NO: 50: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 548 amino acids 
; TYPE: amino acid 

STRANDEDNESS: single 

TOPOLOGY: linear 
US-08-676-279-50 



Query Match 2.4%; Score 117; DB 1; Length 548; 

Best Local Similarity 20.2%; Pred. No. 0.0048; 

Matches 136; Conservative 89; Mismatches 207; Indels 240; Gaps 34; 



Qy 67 VLIDVSPPEAEKRGSYGSTAHASEPG-GQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQ 125

:: 1 1 1 1 1 1 I 1 1 : 1 1 1 I 1 II II

Db 1 MISDKSPPRL-SRPSYGSI -- SSLPGPAPQPAPCR ETYLSEKIP 41

Qy 126 QDSAARDRTDMHRTWRET - - - FLD -- NLRAAGLCVDQQDVQDGNTTVHYALLSA 174

It: : : 1 1 Mil: : hi 1 1

Db 42 IPSADQGTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQAGAVAGFKLLWVL 93

Qy 175 SWA VLC YYAEDLRLKL PLQELPNQ 198

II : 1 1 | | : { : | : | | :

Db 94 LWATVLGLLCQRLAARLGVVTGKDLGEVCHLYYPKVPRILLWLTIELAIVGSDMQEVIGT 153

Qy 199 ASNW SAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKR 255

1 : : III: 1 1 1 : : 1 1 : : : 1 1 1 .1

Db 154 AISFNLLSAGRIPLWG -- GVLITIV-DTFFFLFLDNYGLRKLEAFFG 197

Qy 256 HQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQV 315

: 1 1 : 1 1 : 1 : 1 : 1 : : 1 1 1 1 MM: :

Db 198 -- LLITIMALT-FGYE YWAHP -- SQGALLKGLVLPTCPGCGQPELLQAVGIVGAII 249

Qy 316 LFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFL- 374

: : 1 : : 1 1 1 f : II 1 : 1 : : : 1 : : 1 :

Db 250 MPHNIYLHSALVKSREVDRTRRVDVREANMYF LIEATIALSVSFIINLFVM 300



Qy 375 -VFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFD HGG 422 

II I: ::::| :| : :| ::| || 

Db 301 AVFGQAFYQQT -- NEEAFNIC ANSSLQNYAKIFPRDNNTVSVDIYQGG 346

Qy 423 TVFFSLF MALWAVLLLEYWKRKSATLAY RWDCSDYEDTEERPRP 466

: II : : I I I I I : : I I II 
Db 347 VILGCLFGPAALYIWAVGLLAAGQSSTMTGTYAGQFVMEGFLKLRW 392 

Qy 467 QFAASAPMTAPNPITGEDEPYFPERSR-ARRMLAGSWIV -- VMVAV V 511

I I I I : I I I: I : I I I :
Db 393 SRFARVLLTRSCAILPTVLVAVFRDLKDLSGLNDL 427

Qy 512 VMCLVSI ILYRAIMAI WSRSGNTLLAAWAS-RIASLTGS WNLVFILILSKI 563 

: I I : : I I : : I : I : : : I : I : : I : I I I : : 
Db 428 LNVLQSLLLPFAVLPILTFTSMPAVMQEFANGRMSKAITSCIMALVCAINLYFVISY 484

Qy 564 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAF FKGRFVGYPG 618 

III | : | : | : : | : I : j : : 

Db 485 LPSLPH PAYFGLVALFA-IGYLGLTAYLAWTCCIAHGATFLTHSS 528 

Qy 619 NYHTLFGVRNEE 630 

: I CI: III 
Db 529 HKHFLYGLPNEE 540 



RESULT 14 
US-08-903-139B-8 

; Sequence 8, Application US/08903139B 

; Patent No. 6114118 

; GENERAL INFORMATION: 

APPLICANT: Joe W. Templeton, Jianwei Feng, L. Garry Adams,

APPLICANT: Erwin Schurr, Philippe Gros, Donald S. Davis and Roger Smith 

TITLE OF INVENTION: METHOD OF IDENTIFICATION OF ANIMALS 

TITLE OF INVENTION: RESISTANT OR SUSCEPTIBLE TO DISEASES SUCH AS RUMINANT

TITLE OF INVENTION: BRUCELLOSIS, TUBERCULOSIS, PARATUBERCULOSIS AND SALMONELLOSIS 

NUMBER OF SEQUENCES: 31 

CORRESPONDENCE ADDRESS: 

ADDRESSEE: Pravel, Hewitt, Kimball & Krieger

STREET: 1177 West Loop South, 10th Floor 

CITY: Houston 

STATE: TX 

COUNTRY: USA 

ZIP: 77027-9095 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25
CURRENT APPLICATION DATA:

APPLICATION NUMBER: US/08/903,139B

FILING DATE:

CLASSIFICATION: 435
PRIOR APPLICATION DATA:

APPLICATION NUMBER: 60/031,443 

FILING DATE: September 20, 1996 
ATTORNEY/AGENT INFORMATION: 

NAME: Krieger, Paul E. 

REGISTRATION NUMBER: 25,886

REFERENCE/DOCKET NUMBER: 001 62-3/V96171US 
TELECOMMUNICATION INFORMATION: 

TELEPHONE: 713-850-0909 

TELEFAX: 713-850-0165 
; INFORMATION FOR SEQ ID NO: 8: 
SEQUENCE CHARACTERISTICS: 

LENGTH: 548 amino acids

TYPE: amino acid 

TOPOLOGY: linear 
MOLECULE TYPE: peptide 
US-08-903-139B-8 

Query Match 2.4%; Score 117; DB 2; Length 548; 

Best Local Similarity 20.2%; Pred. No. 0.0048; 

Matches 136; Conservative 89; Mismatches 207; Indels 240; Gaps 34; 

Qy 67 VLIDVSPPEAEKRGSYGSTAHASEPG-GQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQ 125 

:: I I I I I I I I I : I I I I I I I II

Db 1 MISDKSPPRL-SRPSYGSI -- SSLPGPAPQPAPCR ETYLSEKIP 41

Qy 126 QDSAARDRTDMHRTWRET FLD -- NLRAAGLCVDQQDVQDGNTTVHYALLSA 174

II: : : 1 1 III I: : 1 : 1 1 1

Db 42 IPSADQGTFSLRKLWAFTGPGFLMSIAFLDPGNIESDLQAGAVAGFKLLWVL 93

Qy 175 SWA- VLC YYAEDLRLKL PLQELPNQ 198

II : 1 | 1 1 : 1 : 1 . : 1 1 :

Db 94 LWATVLGLLCQRLAARLGVVTGKDLGEVCHLYYPKVPRILLWLTIELAIVGSDMQEVIGT 153

Qy 199 ASNW SAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKR 255

1 III : 1 II: :| 1 : : : II II

Db 154 AISFNLLSAGRIPLWG -- GVLITIV-DTFFFLFLDNYGLRKLEAFFG 197

Qy 256 HQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQV 315

:| |:| I :|:| : I ::| 1 1 1 II II : :

Db 198 -- LLITIMALT-FGYE YWAHP -- SQGALLKGLVLPTCPGCGQPELLQAVGIVGAII 249

Qy 316 LFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFL- 374

: : 1 - r =1 M : : ! 1 | : | : : : | : : | :

Db 250 MPHNIYLHSALVKSREVDRTRRVDVREANMYF LIEATIALSVSFIINLFVM 300

Qy 375 -VFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFD HGG 422

II | : : : : : | : | : : I : : I II

Db 301 AVFGQAFYQQT -- NEEAFNIC ANSSLQNYAKIFPRDNNTVSVDIYQGG 346

Qy 423 TVFFSLF MALWAVLLLEYWKRKSATLAY RWDCSDYEDTEERPRP 466

: 1 1 : : 1 1 1 1 1 : : 1 1 II

Db 347 VILGCLFGPAALYIWAVGLLAAGQSSTMTGTYAGQFVMEGFLKLRW 392

Qy 467 QFAASAPMTAPNPITGEDEPYFPERSR-ARRMLAGSWIV -- VMVAV V 511

1 1 1 1 : 1 1 1 : 1 : 1 1 1 :

Db 393 SRFARVLLTRSCAILPTVLVAVFRDLKDLSGLNDL 427

Qy 512 VMCLVSI ILYRAIMAIWSRSGNTLLAAWAS-RIASLTGS WNLVFILILSKI 563

: 1 1 : : i 1 1 : 1 : : : 1 : 1 : : 1 : 1 1 1 ::

Db 428 LNVLQSLLLPFAVLPILTFTSMPAVMQEFANGRMSKAITSCIMALVCAINLYFVISY 484

Qy 564 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAF FKGRFVGYPG 618

Mi I: 1 :| : : |:|: I: :

Db 485 LPSLPH PAYFGLVALFA-IGYLGLTAYLAWTCCIAHGATFLTHSS 528

Qy 619 NYHTLFGVRNEE 630

: 1 1 : 1 : 1 1 1

Db 529 HKHFLYGLPNEE 540





RESULT 15 
US-08-637-823B-25 

; Sequence 25, Application US/08637823B 
; Patent No. 6184031 
; GENERAL INFORMATION: 

APPLICANT: Gros, Philippe 

APPLICANT: Skamene, Emil 

TITLE OF INVENTION: DNA SEQUENCES THAT ENCODE A NATURAL 
; TITLE OF INVENTION: RESISTANCE TO INFECTION WITH INTRACELLULAR PARASITES 

NUMBER OF SEQUENCES: 32 
; CORRESPONDENCE ADDRESS: 

ADDRESSEE: KLAUBER & JACKSON 

STREET: 411 Hackensack Ave 

CITY: Hackensack 

STATE: New Jersey 

COUNTRY: U.S.A. 

ZIP: 07601 
COMPUTER READABLE FORM: 

MEDIUM TYPE: Floppy disk 

COMPUTER: IBM PC compatible 

OPERATING SYSTEM: PC-DOS/MS-DOS 

SOFTWARE: PatentIn Release #1.0, Version #1.25 
CURRENT APPLICATION DATA: 

APPLICATION NUMBER: US/08/637,823B 

FILING DATE: 05/08/96 

CLASSIFICATION: 435 
ATTORNEY/AGENT INFORMATION: 

NAME: Jackson, David A. 

REGISTRATION NUMBER: 26,742 
TELECOMMUNICATION INFORMATION: 
TELEPHONE: 201 487 5800 
TELEFAX: 201 343 1684 
TELEX: 

; INFORMATION FOR SEQ ID NO: 25: 
SEQUENCE CHARACTERISTICS: 
LENGTH: 548 amino acids 
TYPE: amino acid 
TOPOLOGY: linear 
MOLECULE TYPE: protein 
US-08-637-823B-25 



Query Match 2.4%; Score 117; DB 2; Length 548; 

Best Local Similarity 20.2%; Pred. No. 0.0048; 

Matches 136; Conservative 89; Mismatches 207; Indels 240; Gaps 34; 



Qy 67 VLIDVSPPEAEKRGSYGSTAHASEPG-GQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQ 125 
: : 1 1 1 1 1 1 1 1 1 :l II 1 1 I 1 II 
Db 1 41 

Qy 126 QDSAARDRTDMHRTWRET FLD -- NLRAAGLCVDQQDVQDGNTTVHYALLSA 174 
II: : : 1 1 Mil: : 1 : 1 1 1 
Db 42 I PSADQGT FSLRKLWAFTGPGFLMS I AFLDPGN I ESDLQAGAVAGFKLLWVL 93 

Qy 175 SWA VLC YYAEDLRLKL PLQELPNQ 198 
II : 1 1 1 1 : 1 : 1 : 1 1 : 
Db 94 LWATVLGLLCQRLAARLGVVTGKDLGEVCHLYYPKVPRILLWLTIELAIVGS DMQEVIGT 153 

Qy 199 ASNW SAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKR 255 
1 :: III: 1 1 1 : :| 1 : : : 1 1 II 
Db 154 AISFNLLSAGRI PLWG -- GVLITI V-DTFFFLFLDNYGLRKLEAFFG 197 

Qy 256 HQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQV 315 
: 1 1 : 1 1 : 1 : 1 : 1 : : 1 1 1 1 II II : : 
Db 198 -- LLITIMALT-FGYE YWAHP -- SQGALLKGLVLPTCPGCGQPELLQAVGIVGAI I 249 

Qy 316 L FQH WARWGKWN KYQP LD HVRR YFGE KVAL YFAWLGFYTGWL LPAAWGT LV FL VGC FL- 374 
: : 1 : :| II : :|| |: 1 : .:: I:: |: 
Db 250 MP HN I Y LH SALVKS RE VD RT RR VD VREANMY F LI EAT I AL S VS F II NL FVM 300 

Qy 375 -VFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFD HGG 422 
II 1: :| : :| ::| II 
Db 301 AVFGQAFYQQT -- NEEAFNIC ANSSLQNYAKIFPRDNNTVSVDIYQGG 346 

Qy 423 TVFFSLF MALWAVL L L E YW KR KS AT LAY RWDCSDYEDTEERPRP 466 
: ! 1 : : 1 i 1 1 1 : : 1 1 II 
Db 347 VILGCLFGPAALYIWAVGLLAAGQSSTMTGTYAGQFVMEGFLKLRW 392 

Qy 467 QFAASAPMTAPNPITGEDEPYFPERSR-ARRMLAGSWIV -- VMVAV V 511 
1 1 1 1 : 1 1 1 : 1 : i 1 1 : 
Db 393 SRFARVLLTRSCAILPTVLVAVFRDLKDLSGLNDL 427 

Qy 512 VMCLVSIILYRAIMAIWSRSGNTLLAAWAS-RIASLTGS WNLVFILILSKI 563 
: 1 1 : : I 1 : : 1 : 1 : : : 1 : 1 : : 1 : 1 1 1 : : 
Db 428 LNVLQSLLLPFAVLPILTFTSMPAVMQEFANGRMSKAITSCIMALVCAINLYFVI SY 484 

Qy 564 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVYIAF -- FKGRFVGYPG 618 
III | : | : | : : | : | : | : : 
Db 485 LPSLPH PAYFGLVALFA-IGYLGLTAYLAWTCCIAHGATFLTHSS 528 

Qy 619 NYHTLFGVRNEE 630 
: 1 1 : 1 : 1 1 1 
Db 529 HKHFLYGLPNEE 540 

Search completed: October 27, 2006, 20:30:36 
Job time : 57 secs 




SCORE Search Results Details for Application 10552515 and Search Result us-10-552-515-1.rapbm 




GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd. 

OM protein - protein search, using sw model 

Run on: October 27, 2006, 20:29:57 ; Search time 189 Seconds 

(without alignments) 
2286.664 Million cell updates/sec 

Title: US-10-552-515-1 
Perfect score: 4950 

Sequence: 1 MRMAATAWAGLQGPPLPTLC...SELSSHWTPFTVPKASQLQQ 933 

Scoring table: BLOSUM62 

Gapop 10.0 , Gapext 0.5 

Searched: 2097797 seqs, 463214858 residues 

Total number of hits satisfying chosen parameters: 2097797 

Minimum DB seq length: 0 

Maximum DB seq length: 2000000000 

Post-processing: Minimum Match 0% 

Maximum Match 100% 
Listing first 45 summaries 

Database : Published_Applications_AA_Main: 



/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US07_PUBCOMB.pep:* 

/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US08_PUBCOMB.pep:* 

/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US09_PUBCOMB.pep:* 

/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US10A_PUBCOMB.pep:* 

/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US10B_PUBCOMB.pep:* 

/EMC_Celerra_SIDS3/ptodata/2/pubpaa/US11_PUBCOMB.pep:* 



Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
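
The header figures in this report can be cross-checked against one another. The sketch below is not part of GenCore. It assumes, as an inference from the numbers printed in this report rather than from GenCore documentation, that the "Query Match" percentage is Score divided by Perfect score, and that the "Million cell updates/sec" figure is query length times database residues divided by search time. The function names are hypothetical.

```python
def query_match_percent(score: float, perfect_score: float) -> float:
    """Score as a percentage of the perfect (self-alignment) score.

    Assumption: this is how the "Query Match" column is derived.
    """
    return round(100.0 * score / perfect_score, 1)


def million_cell_updates_per_sec(query_len: int, db_residues: int,
                                 seconds: float) -> float:
    """Dynamic-programming cells filled per second, in millions.

    Assumption: one cell per (query residue, database residue) pair.
    """
    return round(query_len * db_residues / seconds / 1e6, 3)


if __name__ == "__main__":
    # Result 1 of this search: Score 3736 against Perfect score 4950.
    print(query_match_percent(3736, 4950))                      # 75.5
    # This run: 933-residue query, 463214858 residues, 189 seconds.
    print(million_cell_updates_per_sec(933, 463214858, 189))    # 2286.664
```

Both values agree with the figures printed in the report (Query Match 75.5% for Result 1 and 2286.664 Million cell updates/sec), which supports the assumed relationships without confirming them.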

SUMMARIES 

Result No.   Score   Query Match %   Length   DB   ID   Description 

ALIGNMENTS 



RESULT 1 

US-10-450-763-45847 

; Sequence 45847, Application US/10450763 
; Publication No. US20050196754A1 
; GENERAL INFORMATION: 
; APPLICANT: Hyseq, Inc 

; TITLE OF INVENTION: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES 

; FILE REFERENCE: 790CIP3/US 

; CURRENT APPLICATION NUMBER: US/10/450,763 

; CURRENT FILING DATE: 2003-06-11 

; PRIOR APPLICATION NUMBER: PCT/US01/08631 

; PRIOR FILING DATE: 2001-03-30 

; PRIOR APPLICATION NUMBER: 09/540,217 

; PRIOR FILING DATE: 2000-03-31 

; PRIOR APPLICATION NUMBER: 09/649,167 

; PRIOR FILING DATE: 2000-08-23 

; NUMBER OF SEQ ID NOS: 60736 

; SOFTWARE: Custom 

; SEQ ID NO 45847 

LENGTH: 898 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-450-763-45847 

Query Match 75.5%; Score 3736; DB 5; Length 898; 

Best Local Similarity 82.3%; Pred. No. 0; 

Matches 727; Conservative 4; Mismatches 16; Indels 136; Gaps 6; 

Qy 1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAER 37 
I I I I I I I I I I I I I I I I I I I I I II I I I I I I I I ! I I II I 
Db 1 MRMAAT AW AGLQGP PL PT LC PAVRTGLYCRDQAH AERATDWLLAP FCQPKT RS HGTC PP 60 

Qy 38 W-- AMTSETS SG 47 
I I : I I I 
Db 61 TERDPRGEGSTEYPGRVDGIQGWGTRALTGWTDRRLLCQACQTLPPRHWFLPGARGWLGG 120 

Qy 48 SHCA RSRMLRRRAQEEDSTVL I DVS P PEAEKRGS YGST AH 87 
1 1 i : 1 i 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 i 1 1 1 i 1 1 1 1 1 i 1 1 1 
Db 121 SPCAHGQESLPSQPSPILLRVESVKSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAH 180 

Qy 88 ASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDN 147 
1 1 1 1 I 1 1 1 I 1 1 t 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! ! ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 i 1 1 1 1 
Db 181 ASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDN 240 

Qy 148 LRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLL 207 
1 1 1 1 1 1 1 1 1 1 t 1 1 1 1 1 1 1 1 1 i 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : t - 
Db 241 LRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQDYPTRPPTGRPACC 300 

Qy 208 AWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTP 267 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 I 1 1 I 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 301 AWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTP 360 

Qy 268 YGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWN 327 
1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 ] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 i 1 1 
Db 361 YGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWN 420 

Qy 328 KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGS 387 
1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 421 KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGS 480 

Qy 388 KDSFEMCPLCLDCPFWLLSSACALAQ AGRLFDHGGTVFFSLFMALWAVLLLEYWKR 443 
1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 ! 1 1 1 
Db 481 KDSFEMCPLCLDCPFWLLSSACALAQVREEAGRLFDHGGTVFFSLFMALWAVLLLEYWKR 540 

Qy 444 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSW 503 
1 1 t I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 I 1 i 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 
Db 541 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSW 600 

Qy 504 IWMVAVWMCLVSIILYRAIMAI WSRSGNTLLAAWASRIASLTGS WNLVFILILSKI 563 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 601 IWMVAVWMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKI 660 

Qy 564 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTL 623 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 661 YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTL 720 

Qy 624 FGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGA 683 
1 1 1 1 1 1 1 1 i 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 II 
Db 721 FGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGA 780 

Qy 684 711 
1 1 1 1 1 1 1 1 1 1 1 1 ! 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
Db 781 SAGASQGPWEDDYELVPCEGLFDEYLEMGAGFCPNACPELVPELTEPEKARDQPEARSAG 840 

Qy 712 
1 1 1 1 1 1 1 II 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 I 1 1 1 1 1 1 1 1 
Db 841 QDSRPEAVLQFGFVT I FVAACPLAPLFALLNNWVEI RLDARKF 883 





RESULT 2 

US-10-104-047-2574 

; Sequence 2574, Application US/10104047 
; Publication No. US20030236392A1 
; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA 
; FILE REFERENCE: HI -AO 105 

; CURRENT APPLICATION NUMBER: US/10/104,047 

; CURRENT FILING DATE: 2002-03-25 

; PRIOR APPLICATION NUMBER: 

; PRIOR FILING DATE: 

; NUMBER OF SEQ ID NOS: 4096 

; SOFTWARE: PatentIn Ver. 2.1 

; SEQ ID NO 2574 

LENGTH: 920 

TYPE: PRT 
; ORGANISM: Homo sapiens 
US-10-104-047-2574 

Query Match 30.9%; Score 1531.5; DB 4; Length 920; 

Best Local Similarity 37.9%; Pred. No. 9.1e-139; 

Matches 360; Conservative 168; Mismatches 316; Indels 105; Gaps 29; 

Qy 44 TSSGSHCARSRMLRRRAQEEDSTVLID VSPPEAE KRGS YGST AH AS EP 91 
: 1 1 1 :: :: | : I : I I lit: | : | j | | 
Db 4 SSSGITNGKTKVFHPVA -- KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61 

Qy 92 GGQQ AAAC RAGS PAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET FLD 146 
II: : 1 M |::||: : : :|: III 
Db 62 GGET VPERNKSNGL YFRDGKCRI - DY I LVYRK SNPQTEK REVFER 105 

Qy 147 NLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQE LPNQASNW 202 
1 : 1 1 1 1 :: : : 1 : : : 1 1 1 1 1 1 1 1 : : :: 1 : II: 
Db 106 NIRAEGLQMEKESSLI-NSDI IFVKLHAPWEVLGRYAEQMNVRMPFRRKIYYLPRRYKFM 164 

Qy 203 S AGLLAWLGIPNVLL -- EWPDVPP-EYYSCRFRVNKLPRFLGSDNQDTFFTST 253 
1 : 1 II : 1 1 : 1 1 : : 1 : 1 :: 1 : 1 :: 1 1 1 : 
Db 165 SRIDKQISRLRRWLPKKPMRLDKETLPDLEENDCYTAPFSQQRIHHFI-IHNKETFFNNA 223 

Qy 254 KRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQR 313 
1 : 1 : ||: I I 1 1 :| :: :| 1 . 1 I I 1 I | I : | :: : I I 
Db 224 TRSRIVHHILQRIKY-EEGKNKIGLNRLLTNGSYEAAFPLHEGSYRSKNSIRTHGAENHR 282 

Qy 314 QVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCF 373 
: 1 : : i 1 II 1 Mill 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 I : I I I I I I 1 : 1 III 1 
Db 283 HLLYECWASWGVWYKYQPLDLVRRYFGEKIGLYFAWLGWYTGMLFPAAFIGLFVFLYGVT 342 

Qy 374 LVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMAL 432 
: : : | : | : | I I I : I III I I : I | : 1 1 1 : 1 1 1 1 1 : : 1 1 1 : 
Db 343 TLDHSQVSKEVCQATDI I -MCPVCDKYCPFMRLSDSCVYAKVTHLFDNGATVFFAVFMAV 401 

Qy 433 WAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APMTAPNPITGEDEPYFPER 491 
1 1 : 1 1 : 1 1 1 : 1 : 1 1 1 1 1 : 1 : 1 1 1 1 1 1 1 : II 1 : 1 : 1 1 1 
Db 402 WATV FL E FWKRRRAVI AYDWDL I DWE EEEEE I RPQFEAKY S KKE RMN P I SGK PE P YQA FT 461 

Qy 492 SRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNTLLA-AWA SRIA 545 
: h: : 1 I: II: : |::|| : :| 1 II |::| 
Db 462 DKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV ST FAAFKWAL I RNN SQVA 514 

Qy 546 SLTGSW-- NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSP 603 
: 1 1 : 1 1 1 : : 1 : : 1 : 1 : 1 1 1 1 1 : : : : 1 : : 1 1 1 1 : 1 : 1 1 1 1 1 II 
Db 515 T -TGTAVC I N FC I I MLLN VL YE KVAL LLTN LEQP RT E S EWEN S FT LKM FL FQ FVN LN S ST 573 

Qy 604 VYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLI 662 
111111111:111 i 1 1 1 1 1 1 1 1 : 1 :: : II 1 1 1 II I : 
Db 574 FYIAFFLGRFTGHPGAYLRLINRWRLEECHPSGCLIDLCMQMGIIMVLKQTWNNFMELGY 633 

Qy 663 PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE -- GLFDEYLEMVLQFGFVTI 720 
1 :: 1 1 : 1 ::: 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 
Db 634 PLIQNWWTR RKVRQEHGPERKI S FPQWEKDYNLQPMNAYGLFDEYLEMI LQFGFTT I 690 

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780 
1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 II III :: 1 i 1 : 1 1 |: 1 1 1 1 1 : 11 |: 1 : 1 1 : 1 
Db 691 FVAAFPLAPLLALLNN I I EI RLDAYKFVTQWRRPLASRAKDIGIWYGI LEGI GI LSVITN 750 

Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816 
1 1 : : 1 : 1 1 1 : 1 1 1 : : 1 : : 1 : 1 : 
Db 751 AFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLSVFRISDFENRSEPESDG 810 

Qy 817 SSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAI RLAFVIVFEHWFSVGRLLDLL 871 
II: : 1 1 1 i : 1 1 1 : : 1 : : 1 1 1 1 1 1 : 1 1 1 1 1 : 1 1 : 1 : 1 
Db 811 SEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLISYL 870 

Qy 872 VPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920 
:||:|: : : : : 1 1 1 1 : : : 1 1 : |: : 1 : I 
Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919 


RESULT 3 

US-11-072-512-2574 

Sequence 2574, Application US/11072512 
Publication No. US20060029945A1 
GENERAL INFORMATION: 
APPLICANT: ISOGAI, TAKAO 
APPLICANT: SUGIYAMA, TOMOYASU 
APPLICANT: OTSUKI, TETSUJI 
APPLICANT: WAKAMATSU, AI 
APPLICANT: SATO, HIROYUKI 
APPLICANT: ISHII, SHIZUKO 
APPLICANT: YAMAMOTO, JUN-ICHI 
APPLICANT: ISONO, YUUKO 
APPLICANT: HIO, YURI 
APPLICANT: OTSUKA, KAORU 
APPLICANT: NAGAI, KEIICHI 
APPLICANT: IRIE, RYOTARO 
APPLICANT: TAMECHIKA, ICHIRO 
APPLICANT: SEKI, NAOHIKO 
APPLICANT: YOSHIKAWA, TSUTOMU 
APPLICANT: OTSUKA, MOTOYUKI 
APPLICANT: NAGAHARI, KENJI 
APPLICANT: MASUHO, YASUHIKO 
TITLE OF INVENTION: Novel full length cDNA 
FILE REFERENCE: 084335-0191 
CURRENT APPLICATION NUMBER: US/11/072,512 
CURRENT FILING DATE: 2005-03-07 
PRIOR APPLICATION NUMBER: US 60/350,978 
PRIOR FILING DATE: 2002-01-25 
PRIOR APPLICATION NUMBER: JP 2001-379298 
PRIOR FILING DATE: 2001-11-05 
NUMBER OF SEQ ID NOS: 4096 
SOFTWARE: PatentIn Ver. 2.1 
SEQ ID NO 2574 
LENGTH: 920 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-11-072-512-2574 

Query Match 30.9%; Score 1531.5; DB 6; Length 920; 

Best Local Similarity 37.9%; Pred. No. 9.1e-139; 

Matches 360; Conservative 168; Mismatches 316; Indels 105; Gaps 29; 

Qy 44 TSSGSHCARSRMLRRRAQEEDSTVLI D VSPPEAE KRGSYGST AHASEP 91 
: I I I : : :: I ': I : I I III: I : I I i I 
Db 4 SSSGITNGKTKVFHPVA -- KDVNILFDELEAVSSPCKDDDSLLHPGNLTSTSDDASRLEA 61 

Qy 92 GGQQ AAAC RAGS PAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLD 146 
II: : | | | | : : | | : : : : | : I I I 
Db 62 GGETVPERNKSNGLYFRDGKCRI-DYILVYRK SNPQTEK REVFER 105 

Qy 147 N L RAAGLC VD QQ D VQD GN TT VH Y ALL S AS WAV LC Y Y AE D L RL KL P L QE LPNQASNW 202 
I : I I I I : : : : I : : : I I I I I I I I : : : : I : I I : 
Db 106 

Qy 203 -AGLLAWLGIPNVLL -- EWPDVPP-EYYSCRFRVNKLPRFLGS DNQDTFFTST 253 
: I II : I l' : I I : : I : I :: I : I : : I I I : 
Db 165 

Qy 254 
I : I : II: I I I I : I : : : I I I I I I I I I : I 
Db 224 

Qy 314 
II II I I I I I I I I I I I I I I I : 1111111:111 I III : I I'M I 
Db 283 

Qy 374 
I I I : I III I I : I I : I I I : I t I I I : : I I I : 
Db 343 

Qy 433 
I I : I I I : I : I I I I I :| : I I I I I I I : MUCIN 
Db 402 

Qy 492 
I : : I I : : I I II MM 
Db 462 JIVIYRWTV ST FftAFKWAL I RNN SQVA 514 

Qy 546 
II: I I I : : I : : I M Ml I I I : : : : M : I II I : I M MM II 
Db 515 

Qy 604 
I M I I II I I : II I I I I I I MUM : : : I II I I II I 
Db 574 

Qy 663 PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE -- GLFDEYLEMVLQFGFVTI 720 
I :: II : I ::: I I II II I I I I I I I I I I I : I I I I I I I 
Db 634 PLIQNWWTR-- RKVRQEHGPERKISFPQWEKDYNLQPMNAYGLFDEYLEMILQFGFTTI 690 

Qy 721 FVAACPLAPLFALLNNWVEI RLDARKFVCEYRRPVAERAQDI GI WFH I LAGLTHLAVI SN 780 
I I I I I I I I I I I I I I : I I I I I I III : : I I I : I I I : I I I I I : II I : |: M : I 
Db 691 FVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIGIWYGILEGIGILSVITN 750 

Qy 781 AFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA RAP 816 
Il::| :|||:|| I : : |::| :|: 
Db 751 

Qy 817 SSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLL 871 
II: : I I I I : I I I : : I : : I I I I I I : I I I I I : I I : I : I 
Db 811 SEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFIIVFEHLVFCIKHLISYL 870 

Qy 872 VPDI PESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHW 920 
: I |: I I I I I ::: I I : I : : I : I 
Db 871 IPDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGKAHHNEW 919 



RESULT 4 
US-11-177-894-7 

; Sequence 7, Application US/11177894 
; Publication No. US20060040292A1 
; GENERAL INFORMATION: 
; APPLICANT: West, et al. 

; TITLE OF INVENTION: Tumor Markers and Uses Thereof 

; FILE REFERENCE: 2002850-0048 

; CURRENT APPLICATION NUMBER: US/11/177,894 

; CURRENT FILING DATE: 2005-07-08 

; NUMBER OF SEQ ID NOS: 29 

SOFTWARE: PatentIn version 3.2 
; SEQ ID NO 7 

LENGTH: 960 

TYPE: PRT 

ORGANISM: Artificial 
FEATURE: 

OTHER INFORMATION: Transmembrane protein 
US-11-177-894-7 

Query Match 30.1%; Score 1488; DB 6; Length 960; 

Best Local Similarity 37.6%; Pred. No. 1.6e-134; 

Matches 363; Conservative 160; Mismatches 307; Indels 136; Gaps 28; 

Qy 26 GLYCRDQAHAERWAMT — SETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYG 83 

I I I I I : : : I I : I I I I I I : I 
Db 52 GLYFRDGRRKVDYILVYHHKRPSG NRTLVRRVQHSDTP SGA 92 

Qy 84 STAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET 143 

: I : I : I I I I : I : I III 
Db 93 RSVKQDHPLPGKGASLDAGSGEPP MDYHEDD KRFRREE 130 

Qy 144 FLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLPLQELPNQAS 200 

: II III:: : I : I : I : : I I I I I I I I : I I : I :: : : 
Db 131 YEGNLLEAGLELE RDEDTKIHGVGFVKIHAPWN VXCREAEFLKLKMPTKKMYH — I 184 

Qy 201 NWSAGLLAWLGI PNVLLEWPDVPPEYYSCR FRVNKLPRFLGSDNQDTFF 250 

1:1111:11:: : I : I I I I I I. : I : I I 

Db 185 NETRGLLK — KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFDLSD-KDSFF 241 

Qy 251 TSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRL 310 

I I I : : I I I : I I : : I I I I I I I : I I : I I I I I : : 
Db 242 DSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYA7VAYPLHDGDY NGENVEF 295 

Qy 311 NQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLV 370 

I I : : I : : I I I : I : I I I I : I 11:11111: I I I I I I I II I : I I :: I I : I I I 

Db 296 NDRKLLYEEWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLIPASIVGI IVFLY 355 

Qy 371 GCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLF 429 

II : : I I : I : I : : I I I I I I :| : I I I I I I : I III: I I I I I : I 

Db 356 GCATMDENIPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVF 415 

Qy 430 MALWAVLLLEYWKRKSATLAYRWDCSDYEDTEE RPRPQFAA SAPMTAPNPI 480 

I I I I I : I : I I I I ! I I I I : : I : I I I I : : I I : I 

Db 416 MALWAATFMEHWKRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNKE 475 

Qy 481 TGEDEPYFPERSRARRMLAGSWIVVhWAVVVMCLVSI ILYRAIMAIWSRSGNTLIAAW 540 

I I : II I I I : I : I I :: : I : I I II :: : : : : 

Db 476 T -- DKVKLTWRDRFPAYLTNLVSIIFMIAVTFAIVLGVI I YRI SMAAALAMNSSPSVRSN 533 

Qy 541 ASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600 

: I :: I I I I : : I : : I : I I | : I : : I : I I : I I :: I I I I 
Db 534 IRVTVTATAVIINLWI ILLDEVYGCIARWLTKIEVPKTEKS FEERLI FKAFLLKFVNSY 593 

Qy 601 SSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI-NNMQ 658 

: I : I I I I I I I I I I I : I : I I I I I I I I I I :| I : I : I I : I I I : I I I : 
Db 594 TPIFYVAFFKGRFVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSIIMLGKQLIQNNLF 653 

Qy 659 EVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFV 718 

I : I I I : I : : I : : : :l III I II I I : I I : : I I I I I 

Db 654 EIGIPKMKKLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMI IQFGFV 713 

Qy 719 TI FVAACPLAPLFALLNNWVEI RLDARKFVCEYRRPVAERAQDIGI WFHI LAGLTHLAVI 778 

I : I I I : II I I I I I I I I I : I I I I I I : I I I I I I I I I II : I I I I I : : I I I : I I I I 
Db 714 TLFVASFPLAPLFALLNNIIEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIGKLAVI 773 

Qy 779 SNAFLLAFSSDFLPRA — YYRWTRAHDLRGFLNFTLARAPSSF AAAHN 824 

I I I : : : I : I I I : I I I : : : : I I : I I I III II 
Db 774 I N AFV ISFTSDFIP RL VYL YMY S KNGTMHG FVN HT L SSFNVSDFQNGTAPNDPLDL 829 

Qy 825 RTCRYRAFRD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI 875 

: I I I : : I : : I I : : I : I I I I I I II I I : : : I : : I : : I I I 
Db 830 GYEVQICRYKDYREPPWSENKYDISKDFWAVLAARLAFVIVFQNLVMFMSDFVDWVIPDI 889 

Qy 876 PESVE I KVKREY YLA . KQALAENEVL FGTNGT KDEQP KG 913 

I :::::! I I I I I I I I I I 

Db 890 PKDISQQIHKEKVLMVELFMREEQDKQQLL — ETWMEKERQKDEPPCNHHNTKACPDSLG 947 

Qy 914 SELSSH 919 

I I I 

Db 948 SPAPSH 953 



RESULT 5 

US-11-177-894-11 

Sequence 11, Application US/11177894 
Publication No. US20060040292A1 
GENERAL INFORMATION: 
APPLICANT: West, et al. 

TITLE OF INVENTION: Tumor Markers and Uses Thereof 
FILE REFERENCE: 2002850-0048 
CURRENT APPLICATION NUMBER: US/11/177,894 
CURRENT FILING DATE: 2005-07-08 
NUMBER OF SEQ ID NOS: 29 
SOFTWARE: PatentIn version 3.2 
SEQ ID NO 11 
LENGTH: 840 
TYPE: PRT 

ORGANISM: Artificial 
FEATURE: 

OTHER INFORMATION: Homo sapiens 
US-11-177-894-11 

Query Match 29.9%; Score 1479.5; DB 6; Length 840; 

Best Local Similarity 40.0%; Pred. No. 8.9e-134; 

Matches 340; Conservative 152; Mismatches 270; Indels 89; Gaps 22; 

Qy 135 IRTWRETFLDNLRAAGLCVDQQDVQDGNTTVH YALLSASWAVLCYYAEDLRLKLP 191 
i II: II III:: :| : I :| : : I I III I I I: I I : I 
Db 6 

Qy 192 K5LLAWLGIPNVLLEWPDVPPEYYSCR FRVNKLPRFL 241 
III I :| I :: : I : I I I I 
Db 62 

Qy 242 
I I : I : I I I I I: : I I I : I I : : I I III II : I I : I I I I I 
Db 118 172 

Qy 302 
I I :: I : : I I I : I : I I I I : I I I : I I I II : I I I I I I I II I : I I : 
Db 173 -- NGENVE FN DRKLLYEEWARYGVFYKYQP I DLVRKYFGEKI GLYFAWLGVYTQML I PAS 230 

Qy 362 WGTLVFLVGCFLVFSDIPTQELCGSKDSF EMC PLC-LDCPFWLLS S AC AL AQAG RL FD H 420 
:|| :lll II : :||: |:| : : I I I I J I : I * I I I I I I : I III: 
Db 231 IVGI IVFLYGCATMDENI PSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDN 290 

Qy 421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAA SAPMT 475 
11111:11 MM :|:|||| I II I I : : I : I : I I : : I I 
Db 291 PATVFFSVFMALWAATFMEHWKRKQMRLNYRWDLTGFEEEEDHPRAEYEARVLEKSLKKE 350 

Qy 476 APNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVSIILYRAIMAIWSRSGNT 535 
: I I I : II I I I : I : II :: :| : I I II : : : : 
Db 351 SRNKET -- DKVKLTWRDRFPAYLTNLVS 1 1 FMIAVTFAIVLGVI I YRI SMAAALAMNSSP 408 

Qy 536 
I : : I I | I : : I :: I : I I I : I 
Db 409 

Qy 596 FVNFYSSPVYIAFFKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI 654 
Mil: i : I I I I I I I I I I I : I : I I MM Nihil :| : I I : I I I :| 
Db 469 FVNSYTPI FYVAFFKGRFVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSI IMLGKQLI 528 

Qy 655 -NNMQEVLI PKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVL 713 
I I : I : I I I : I : : I : : : : I I I I I II I I : II : : 
Db 529 QNNLFEIGIPKMKKLIRYLKLKQQSPPDHEECWRKQRYEVDYNLEPFAGLTPEYMEMII 588 

Qy 714 Q FGFVT I FVAAC PLAP L F AL LN NWVE I RL D AR KFVC E Y RR PV AE RAQDIGIWFHI L AGLT 773 
II I I II : M I : I II I I I M I II : I I I I I I : I I I I I I I I I I I : I I I M : : I I I : 
Db 589 QFGFVTLFVASFPLAPLFALLNNI IEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIG 648 

Qy 774 HLAVISNAFLLAFSSDFLPRA -- YYRWTRAHDLRGFLNFTLARAPSSF AAAHN 824 
I I I I : I I : : : I : I I I : I I I : : : : MM II III II 
Db 649 KLAVIIDAFVISFTSDFIPRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPN 704 

Qy 825 RTCRYRAFRD DDGHY -- SQTYWNLLAIRLAFVIVFEHWFSVGRLLDL 870 
: I I I : : I : : I I : : I : I I I I I I I I I I : : : I : .: I 
Db 705 DPLDLGYEVQICRYKDYREPPWSENKYDISKDFWAVLAARLAFVIVFQNLVMFMSDFVDW 764 

Qy 871 LVPD I PESVE I KVKREYYLA KQALAENEVLFGTNGTKDEQP 911 
:: I I I I : : :: :| I Ml I I I I I 
Db 765 

Qy 912 KGSELSSH 919 
M II 
Db 823 

RESULT 6 

US-11-097-143-15228 

; Sequence 15228, Application US/11097143 

; Publication No. US20050208558A1 

; GENERAL INFORMATION: 

; APPLICANT: Venter, J. Craig 

; APPLICANT: et al. 

; TITLE OF INVENTION: DETECTION KIT, SUCH AS NUCLEIC ACID 

; TITLE OF INVENTION: ARRAYS, FOR DETECTING EXPRESSION OF 10,000 OR MORE 
; TITLE OF INVENTION: DROSOPHILA GENES. 
; FILE REFERENCE: CL000728 

; CURRENT APPLICATION NUMBER: US/11/097,143 

; CURRENT FILING DATE: 2005-04-04 

; PRIOR APPLICATION NUMBER: 60/157,832 

; PRIOR FILING DATE: 1999-10-05 

; PRIOR APPLICATION NUMBER: 60/160,191 

; PRIOR FILING DATE: 1999-10-19 

; PRIOR APPLICATION NUMBER: 60/161,932 

; PRIOR FILING DATE: 1999-10-28 

; PRIOR APPLICATION NUMBER: 60/164,769 

; PRIOR FILING DATE: 1999-11-12 

; PRIOR APPLICATION NUMBER: 60/173,383 

; PRIOR FILING DATE: 1999-12-28 

; PRIOR APPLICATION NUMBER: 60/175,693 

; PRIOR FILING DATE: 2000-01-12 

; PRIOR APPLICATION NUMBER: 60/184,831 

; PRIOR FILING DATE: 2000-02-24 

; PRIOR APPLICATION NUMBER: 60/191,637 

; PRIOR FILING DATE: 2000-03-23 

; NUMBER OF SEQ ID NOS: 43008 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 15228 

LENGTH: 1219 

TYPE: PRT 

ORGANISM: DROSOPHILA 
US-11-097-143-15228 



Query Match 29.2%; Score 1445; DB 6; Length 1219; 

Best Local Similarity 35.6%; Pred. No. 3.6e-130; 

Matches 342; Conservative 165; Mismatches 332; Indels 122; Gaps 27; 



Ov 
vy 


35 


AERWAMTSETSSGSHCARSRML RRRAQEEDSTVLIDVSPPEAEKRGSY 


82 


Db 


24 9 


1:1 : 1 II 1 : 1 1 : 1 : 1 : 1 III 
ADRVNQSYEVMESSH SNVLPDQFGYRQLI PTERKASDTASSV SGSY 


294 


Ov 
vy 


83 


GSTAHASEP GGQQAAACRAGSPAKP RIADFVLVW-EEDLKLDRQ 

: 1 1 : II: 1 : 1 1 1 1 I 1 1 1 1 : : 
Db 295 YGSRKASKSNSLGGESGDERRVSKQDREGLDPESLMFRDGRRKVDMVLAWEEEDLGVMTE 354

Qy 126 QDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQD-VQDGNTTVHYALLSASWAVLCYYAE 184
       : : 1 1 1 : 1 : : 1 1 1 1 1 : :| 1 I : : 1 : II
Db 355 /"IEj/*\i\ i\auh r\i\ 0 r i iDi' ±j x ixuvjuu v uucjijr\.j\£jc l'uiii C't -Ui\ J. ri u t r»rvj_ii_i ± r\uj\i-t 407

Qy 185 DLRLKLP LQELPNQASNWSAGLLAWLGIPNVLLEWPDVPP 225
       : 1 i 1 1 1: : 1 1 : : 1 1 1
Db 408 VMN L KL P VKR F I T I S V KP S WD E EN W LRNMQYWK D VWQ R- LTKKIQLDQTLLE GET 462

Qy 226 EYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEG 285
       : : 1 : 1 : 1 : 1 1 1 1 :| :: :: I : I I : : : 1 1 : 1 : : 1
Db 463 TFKAATANnNPFFOF^ VKn-RATAFT^AOR^T.MVMOVT.TRTPFnF^nRc: f;TRRT,MNnn 519

Qy 286 VLSAAFPLHDGPFKTPPEGPQAPRLN-QRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVA 344
       lllhl : . : 1 : :: |:||:| II MINI ||:|||:|:|
Db 520 TYT.nrFPT.HFnRY DRPH^nT ^T.nRRVT.YOTWAHP^OWYK'KOPT.rT.VRK'YFnnK'T A 575

Qy 345 L Y FAWLGF YT GWLL P AAWGT L VF LVGC FLVFSD — IPTQELCG — SKDS FEMC PL C - LD 399
       1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 1 : 1 : 1 : 1 :: 1 : 1 : : 1 1 1 1
Db 576 LYFCWLGFYTEMLVYPAWGTLCFIYGLATLESEDNTPSKEICNEYGTGNITLCPLCDKA 635

Qy 400 CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYED 459
       1 : 1 1 : 1 : : III: 1 1 1 1 : : 1 1 : 1 1 1 1 1 1 1 1 : 1 : 1 1 : ' 1
Db 636 CSYQRLSESCLFSRLTYLFDNPSTVFFAIFMSFWATTFLELWKRKQSVLVWEWDLHNV-D 694

Qy 460 TEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMI^GSWIv^^AWVMCLVSII 519
       : 1 1 1 : 1 : 1 1 1 :| 1 1 1 1 : 1 : 1 : :: I :: I I : : : I
Db 695 MD EE N R P E FET N AT T F RMN P VT RE KE P YMS TWN R S I RFV I T G S AVL FM I S W L S AVLGT I 754

Qy 520 LYRAIMAIWSRSGNTLLAAWASRIASLTGSVVNLVFILILSKI YVSLAHVLTRWEMHRT 579
       III : 1 : t : 1 1 : 1 : : : 1 1 1 1 : 1 1 : : 1 1 : 1 II 1 II
Db 755 L YRI TL VS V I YGGGGF FVKE HAKL FT S VTAAL I N LWI MI LT RI YH RMAI KLTN LEN PRT 814

Qy 580 QTKFEDAFTLKVFI FQFVNFYSSPVYIAFFKGRFVGYPGNYHT LFGVRNEECAAGG 635
       1 :: 1 1 : : 1 1 : 1 1 : 1 : 1 1 1 1 1 : 1 1 1 1 1 1 1 1 1 III: 1 : : 1 : 1 1
Db 815 HTEYEDSYTFKIFFFEFMNFYSSLIYIAFFKGRFFDYPGDDQARKSEFFRLKNDICDPAG 874

Qy 636 CLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDD 695
       1 1 1 1 : 1 : 1 1 1 1 1 1 II 1 1 1 1 II: : 1 : : 1 III
Db 875 CLSELCIQLAI IMVGKQCWNNFMEYLFPKFWNWWR QRKHKQATKDESHLHMAWEQD 930

Qy 696 YELV-PCE-GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRR 753
       1 : 1 1 1 1 1 1 1 1 1 : 1 1 : 1 1 1 1 : j 1 1 1 1 1 1 1 1 1 1 1 II 1 1 1 II 1 1 1 1 : 1 1
Db 931 YHMQDPGRLALFDEYLEMILQYGFWLFVAAFPLAPLFALLNNVAEIRLDAYKMVTQARR 990

Qy 754 PVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYR — WTRAHDLRGFLNFT 811
       1 : 1 1 1 : 1 1 1 1 : 1 1 : 1 : 1 1 : 1 1 1 1 : : 1 : : 1 1 1 : 1 1 1 : : : III:: :
Db 991 PLAERVEDIGAWYGILRI ITYTAWSNAFVIAYTSDFI PRMVYKFVYSETHTLAGYIEHS 1050

Qy 812 LA RAPSSFAAAHNRTCRYRAFRDDDGHY SQT YWN LLAI RLAFVI VF 857
       1 : : 1 : 1 1 : 1 1 : h 1 1 1 1 : : 1 1 1 1 1 1 : 1 1
Db 1051 LSIFNTSDYKEEWGASVSEKDPDTCQYRGYRNGPKDYEPYGLSPHYWHVFAARLAFVWF 1110

Qy 858 EHWFSVGRLLDLLVPDI PESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGSELS 917
       1 1 1 1 1 : : : : : 1 1 : 1 1 : :::| 1 1 1 1 :| : : 1 1 1 : :
Db 1111 EHWFVITGIMQFI IPDVPSEVKTQMQREQLLAKEAKYQ HGIKRAQGDSQDIM 1163






Qy 918 S 918
       |
Db 1164 S 1164



RESULT 7 

US-10-484-148-14 

Sequence 14, Application US/10484148 
Publication No. US20040248251A1 
GENERAL INFORMATION: 



APPLICANT: LAL, Preeti G.; HONCHELL, Cynthia D.;
APPLICANT: FORSYTHE, Ian J.; CHAWLA, Narinder K.;
APPLICANT: TANG, Y. Tom; BOROWSKY, Mark L.; BARROSO, Ines;
APPLICANT: YUE, Henry; WARREN, Bridget A.;
APPLICANT: THANGAVELU, Kavitha; GIETZEN, Kimberly J.;
APPLICANT: AZIMZAI, Yalda; LEE, Ernestine A.;
APPLICANT: BAUGHN, Mariah R.; GORVAD, Ann E.;
APPLICANT: DUGGAN, Brendan M.; TRAN, Bao;
APPLICANT: LI, Joana X.; RICHARDSON, Thomas W.;
APPLICANT: ELLIOTT, Vicki S.; ZEBARJADIAN, Yeganeh
APPLICANT: TRAN, Uyen K.; YAO, Monique G.;
APPLICANT: PETERSON, David P.; LUO, Wen
APPLICANT: LEHR-MASON, Patricia M.
TITLE OF INVENTION: RECEPTORS AND MEMBRANE ASSOCIATED PROTEINS 
FILE REFERENCE: PF-1082 USN 
CURRENT APPLICATION NUMBER: US/10/484,148
CURRENT FILING DATE: 2004-01-15 
PRIOR APPLICATION NUMBER: PCT/US02/22833 
PRIOR FILING DATE: 2002-07-16 
PRIOR APPLICATION NUMBER: US 60/306,020 
PRIOR FILING DATE: 2001-07-17 
PRIOR APPLICATION NUMBER: US 60/308,179 
PRIOR FILING DATE: 2001-07-27 
PRIOR APPLICATION NUMBER: US 60/309,702 
PRIOR FILING DATE: 2001-08-02 
PRIOR APPLICATION NUMBER: US 60/311,476 
PRIOR FILING DATE: 2001-08-10 
PRIOR APPLICATION NUMBER: US 60/311,718 
PRIOR FILING DATE: 2001-08-10 
PRIOR APPLICATION NUMBER: US 60/311,551 
PRIOR FILING DATE: 2001-08-10 
PRIOR APPLICATION NUMBER: US 60/314,798 
PRIOR FILING DATE: 2001-08-24 
PRIOR APPLICATION NUMBER: US 60/316,639 
PRIOR FILING DATE: 2001-08-31 
PRIOR APPLICATION NUMBER: US 60/317,996
PRIOR FILING DATE: 2001-09-07
NUMBER OF SEQ ID NOS: 46
SOFTWARE: PERL Program 
SEQ ID NO 14 
LENGTH: 910 
TYPE: PRT 

ORGANISM: Homo sapiens 
FEATURE: 

NAME/KEY: misc_feature

OTHER INFORMATION: Incyte ID No: 3718011CD1 
US-10-484-148-14 

Query Match 28.3%; Score 1402.5; DB 5; Length 910; 

Best Local Similarity 38.2%; Pred. No. 3.1e-126; 

Matches 322; Conservative 157; Mismatches 286; Indels 79; Gaps 24; 
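The two percentages in each result header appear to follow from the other printed numbers. The sketch below is a consistency check only: it assumes that Query Match is the score divided by the perfect score (4950 for this query) and that Best Local Similarity is the number of matches divided by the total number of alignment columns (matches + conservative + mismatches + indels). These formulas are inferred from the figures in this listing, not taken from GenCore documentation, and `alignment_stats` is a hypothetical helper name.

```python
def alignment_stats(score, perfect_score, matches, conservative, mismatches, indels):
    """Recompute the two percentages printed in a result header.

    Assumption: the alignment length is the sum of the four column counts.
    """
    columns = matches + conservative + mismatches + indels
    query_match = 100.0 * score / perfect_score
    best_local_similarity = 100.0 * matches / columns
    return round(query_match, 1), round(best_local_similarity, 1)


# Values printed for RESULT 7; 4950 is the perfect score of the query.
print(alignment_stats(1402.5, 4950, 322, 157, 286, 79))
```

Under these assumptions the recomputed values agree with the printed 28.3% and 38.2%, and the same check reproduces the percentages printed for the other results in this section.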

Qy 108 RIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVD-QQDVQDGNTT 166
       I I I I I I : I : : : :: : : : | I : : I I | I : : : I I
Db 67

Qy 167 5ASWAVLCYYAEDLRLKLPLQELPNQASNWSA — GLLAWLGI PNVLLEWPDVP 224
       I I I I I I I I : : II I I : II I I : III : I :
Db 123

Qy 225
       I I : : i I : I I I I : I : : I I : : I : I I : : I
Db 179

Qy 284
       I I I I I I I I : II I I : I : I : : I I I lilt : I : I : I I I :
Db 238 SGIYKAAFPLHDCKFRRQSEDPSCP — NERYLLYREWAHPRSIYKKQPLDLIRKYYGEKI 295

Qy 344 ALYFAWLGFYTGWLLPAAVVGTLVFLVGCFLVFSDIPTQELC-- — GSKDSFEMCPLCLD 399
       : I I I I I I : I I I I I I I I I III : : : I : I II I I I I I
Db 296 GIYFAWLGYYTQMLLLAAWGVACFLYGYLNQDNCTWSKEVCHPDIGGK— IIMCPQC-D 352

Qy 400 — CPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDY 457
       IIMI: | : : : I I I I : |: :| |':| I I I : I i I : I I I I I :
Db 353

Qy 458
       II: I I:: I I II hi h I I I I
Db 413

Qy 516 IIILYRAIMAIWSR SGNT LLAAWAS — RI AS LTGS WNLVF I L I LSKI YVS L 567
       I I : I I- : I I I : I : : : I : I I : ': : : . I : I I : I I :
Db 471

Qy 568
       I : : I : I : I I I I : I : : I : I : I : I I I I I : I I I I I I I I I I : I I I ! I I : : I
Db 531 AIMITNFELPRTQTDYENSLTMKMFLFQFVNYYSSCFYIAFFKGKFVGYPGDPVYWLGKY 590

Qy 627 RNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAG 686
       I I I I I I I I I : I I : I : I I I I : I I : I I I I : I : :| I I
Db 591 RNEECDPGGCLLELTTQLTI IMGGKAIWNNIQEVLLPWIMNLIGRFHRVSGSEKITPR — 648

Qy 687 ASQGPWEDDYELVPCE — GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDA 744
       II II I I III I I I I I : : I I I I I I : I I I : I I I I I IIMI : I I |: I I
Db 649 WEQD YH LQPMGKLGLFYEYLEMI I QFGFVT LFVAS FPLAPLLALVNN I LE I RVDA 703

Qy 745                                                              799
       1111:111111 I : I : I I I : : I I : : I I : I I : I I t I :
Db 704

Qy 800 -RAHDLRGFLNFTLARAPSSFAAA HNRTCRYRAFRDDDGH — 838
       : : : I :: I I I III : : I I I I I I I II
Db 764

Qy 839 YSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALA 895
       : : I I : :: I :| I I : I I I I |:: I I ' : : I I : : : I :: t I II : : I
Db 820 EYKHNIYYWHVIAAKLAFIIVMEHVIYSVKFFISYAIPDVSKRTKSKIQREKYLTQKLLH 879

Qy 896
       I I :
Db 880



RESULT 8 

US-11-097-143-24771 

; Sequence 24771, Application US/11097143 

; Publication No. US20050208558A1 

; GENERAL INFORMATION: 

; APPLICANT: Venter, J. Craig 

; APPLICANT: et al. 

; TITLE OF INVENTION: DETECTION KIT, SUCH AS NUCLEIC ACID 

; TITLE OF INVENTION: ARRAYS, FOR DETECTING EXPRESSION OF 10,000 OR MORE 
; TITLE OF INVENTION: DROSOPHILA GENES. 
; FILE REFERENCE: CL000728 

; CURRENT APPLICATION NUMBER: US/11/097,143

; CURRENT FILING DATE: 2005-04-04 

; PRIOR APPLICATION NUMBER: 60/157,832 

; PRIOR FILING DATE: 1999-10-05 

; PRIOR APPLICATION NUMBER: 60/160,191 

; PRIOR FILING DATE: 1999-10-19 

; PRIOR APPLICATION NUMBER: 60/161,932 

; PRIOR FILING DATE: 1999-10-28 

; PRIOR APPLICATION NUMBER: 60/164,769 

; PRIOR FILING DATE: 1999-11-12

; PRIOR APPLICATION NUMBER: 60/173,383 

; PRIOR FILING DATE: 1999-12-28 

; PRIOR APPLICATION NUMBER: 60/175,693 

; PRIOR FILING DATE: 2000-01-12 

; PRIOR APPLICATION NUMBER: 60/184,831 

; PRIOR FILING DATE: 2000-02-24

; PRIOR APPLICATION NUMBER: 60/191,637 

; PRIOR FILING DATE: 2000-03-23 
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; NUMBER OF SEQ ID NOS : 4 3008 

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 24771 

LENGTH: 1075 

TYPE: PRT 

ORGANISM: DROSOPHILA 
US-11-097-143-24771 



Query Match 27.7%; Score 1369.5; DB 6; Length 1075; 

Best Local Similarity 37.4%; Pred. No. 6.4e-123; 

Matches 313; Conservative 163; Mismatches 283; Indels 77; Gaps 20; 
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940 





RESULT 9 

US-11-177-894-10 

; Sequence 10, Application US/11177894 

; Publication No. US20060040292A1 

; GENERAL INFORMATION: 

; APPLICANT: West, et al. 

; TITLE OF INVENTION: Tumor Markers and Uses Thereof 
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; FILE REFERENCE: 2002850-0048 

; CURRENT APPLICATION NUMBER: US/11/177,894 

; CURRENT FILING DATE: 2005-07-08 

; NUMBER OF SEQ ID NOS : 29 

SOFTWARE: PatentIn version 3.2
; SEQ ID NO 10 

LENGTH: 712 

TYPE: PRT 

ORGANISM: Artificial 
FEATURE: 

OTHER INFORMATION: Homo sapiens 
US-11-177-894-10 

Query Match 27.6%; Score 1367.5; DB 6; Length 712; 

Best Local Similarity 41.6%; Pred. No. 5.4e-123; 

Matches 299; Conservative 128; Mismatches 220; Indels 71; Gaps 17; 

259 LFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQ 318 
: : I I I : I I : : I I I I I I I : I I : I I I I I : : I I : : I : : 

2 VYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY NGENVEFNDRKLLYE 55 

319 HWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSD 378 
I I I : I : I I I I : I I I : I I I I I : I I I I I I I II I : I I : : I I : I I I II : : 
56 EWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLIPASIVGIIVFLYGCATMDEN 115

379 IPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLL 437

I I : I : I : : I I I I I I : I : I I I I I I : I III: I I I I I : I I I I I I
116 IPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVFMALWAATF 175

438 LEYWKRKSATLAYRWDCSDYEDTEE RPRPQFAA SAPMTAPNPITGEDEPYF 488

: I : I I I I I I I I I : : I : I I I I : : I I : I I I : 

176 MEHWKRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKXESRNKET — DKVKL 233 

489 PERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIWSRSGNTLLAAWASRIASLT 548 

II I I I : i : I I :: : | : | | | | : : : : : | 

234 TWRDRFPAYLTNLVSIIFMIAVTFAIVLGVIIYRISMAAALAMNSSPSVRSNIRVTVTAT 293 

549 GSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAF 608

:: II I I : : I : : I : I I I : I : : I : I I : I I : :| I I I : I : I I 
294 AVI I NLWI I LLDEVYGC IARWLTKI EVPKTEKS FEERLI FKAFLLKFVNSYTP I FYVAF 353 

609 FKGRFVGYPGNYHTLF-GVRNEECAAGGCLIELAQELLVIMVGKQVI-NNMQEVLIPKLK 666 

111111111:1 : I 111111111:11 : I : I I : I I I : I I I : I : I I I : I 
354 FKGRFVGRPGDYVYIFRS FRMEECAPGGCLMELCIQLSIIMLGKQLIQNNLFEIGIPKMK 413 

667 GWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTIFVAACP 726 

: : I : : : MINI II I I : I I : : I I I I I ! : I I I : I 

414 KLIRYLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMI IQFGFVTLFVASFP 473 

727 LAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAF 786 

I I I I II I I I I : I I I I I I : I I I I I I I I I I I : I I I I I : : I I |: I I I I I I I ::: I 
474 LAPLFALLNNIIEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIGKLAVI INAFVISF 533 

787 SSDFLPRA — YYRWTRAHDLRGFLNFTLARAPSSF AAAHN RTCR 828 

: I I I : I I I : : : : I |: I I I III II : I I 

534 TSDFI PRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPNDPLDLGYEVQICR 589 

829 YRAFRD DDGHY--SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI PESVEIKV 883

I : : I : : I I : : I : I I I I I I I I I I : : : I : : I : : I I I I : : : : 
590 YKDYREPPWSENKYDISKDFWAVLAARLAFVIVFQNLVMFMSDFVDWVIPDIPKDISQEI 649 

884 KREYYLA KQALAENEVLFGTNGTKDEQP KGSELSSH 919 

: I I I I I I I I I I I I I I 

650 HKEKVLMVELFMREEQDKQQLL — ETWMEKERQKDEPPCNHHNTKACPDSLGSPAPSH 705 



RESULT 10 

US-11-097-143-21858 

; Sequence 21858, Application US/11097143 

; Publication No. US20050208558A1 

; GENERAL INFORMATION: 

; APPLICANT: Venter, J. Craig 

; APPLICANT: et al. 

; TITLE OF INVENTION: DETECTION KIT, SUCH AS NUCLEIC ACID 

; TITLE OF INVENTION: ARRAYS, FOR DETECTING EXPRESSION OF 10,000 OR MORE 
; TITLE OF INVENTION: DROSOPHILA GENES. 
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; FILE REFERENCE: CL000728 

; CURRENT APPLICATION NUMBER: US/11/097,143

; CURRENT FILING DATE: 2005-04-04 

; PRIOR APPLICATION NUMBER: 60/157,832 

; PRIOR FILING DATE: 1999-10-05 

; PRIOR APPLICATION NUMBER: 60/160,191 

; PRIOR FILING DATE: 1999-10-19 

; PRIOR APPLICATION NUMBER: 60/161,932

; PRIOR FILING DATE: 1999-10-28 

; PRIOR APPLICATION NUMBER: 60/164,769 

; PRIOR FILING DATE: 1999-11-12 

; PRIOR APPLICATION NUMBER: 60/173,383 

; PRIOR FILING DATE: 1999-12-28 

; PRIOR APPLICATION NUMBER: 60/175,693 

; PRIOR FILING DATE: 2000-01-12 

; PRIOR APPLICATION NUMBER: 60/184,831 

; PRIOR FILING DATE: 2000-02-24 

; PRIOR APPLICATION NUMBER: 60/191,637 

; PRIOR FILING DATE: 2000-03-23 

; NUMBER OF SEQ ID NOS: 43008

; SOFTWARE: FastSEQ for Windows Version 4.0 

; SEQ ID NO 21858 

LENGTH: 1058 

TYPE: PRT 

ORGANISM: DROSOPHILA 
US-11-097-143-21858 

Query Match 24.2%; Score 1199.5; DB 6; Length 1058;

Best Local Similarity 33.6%; Pred. No. 2.1e-106; 

Matches 294; Conservative 155; Mismatches 280; Indels 147; Gaps 26; 
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I I : I I I I I I : I I I I : I : I I I I I I I I I I I I : I : I I I I I : I I I I I 
Db 802 AENNSLYVEYLEMWQFGFITLFSLAFPLAPLLALLNNVIEVRLDAIKMLRFLRRPVGMR 861 

Qy 759 AQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAH-DLRGFLNFTLA 813 

I : I I I : I I : : I : I I I : I : : I I I : : : I : I : : I : I I M f I 
Db 862 ARDIGVWHSIMTWTRIAVASSAMIIAFSTNLIPKIVYAASMGDPELNNYLNFTLAVFNT 921 



Qy 814 RAPSSFAAAH-NRT-CRYRAFRD — DDGH-YSQ — TYWNLLAIRLAFVIVFEHWF 862 

I : I I t III II: :| I I : II :| I I I I ::::::: : 
Db 922 KDFQVQPLLGGSQHVNETVCRYTEFRNSPEDPHPYKRPMIYWKILTGRLAFIVIYQNI IT 981 

Qy 863 SVGRLLDLLVPDIPESVEI KVKRE Y Y LA KQ AL AE N E 898 

: : I III: : : : I I I : I :: : I I 
Db 982 MLQGILRWAVPDVSGRLLKRIKRENFLLREHI I EYE 1017



RESULT 11 

US-10-104-047-2541 

; Sequence 2541, Application US/10104047 
; Publication No. US20030236392A1 
; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA
; FILE REFERENCE: H1-A0105 

; CURRENT APPLICATION NUMBER: US/10/104,047

; CURRENT FILING DATE: 2002-03-25 

; PRIOR APPLICATION NUMBER: 

; PRIOR FILING DATE: 

; NUMBER OF SEQ ID NOS: 4096

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 2541 

LENGTH: 596 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-104-047-2541 

Query Match 23.3%; Score 1154; DB 4; Length 596; 

Best Local Similarity 41.3%; Pred. No. 2.3e-102; 

Matches 250; Conservative 108; Mismatches 194; Indels 54; Gaps 14; 
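Each result record carries its statistics as `Name value;` pairs spread over three lines. A minimal sketch of how such lines could be read into a dictionary is shown below; the regular expression and the helper name `parse_stats` are illustrative assumptions, not part of GenCore or SCORE, and the pattern was written only against the field layout seen in this listing.

```python
import re

# One "Name value;" pair, e.g. "Query Match 23.3%;" or "Pred. No. 2.3e-102;".
STAT_RE = re.compile(
    r"([A-Za-z][A-Za-z. ]*?)\s+(\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)%?;"
)


def parse_stats(lines):
    """Collect every 'Name value;' pair from the statistics lines of one result."""
    stats = {}
    for line in lines:
        for name, value in STAT_RE.findall(line):
            stats[name.strip()] = float(value)
    return stats


# The three statistics lines of RESULT 11, as printed above.
result_11 = [
    "Query Match 23.3%; Score 1154; DB 4; Length 596;",
    "Best Local Similarity 41.3%; Pred. No. 2.3e-102;",
    "Matches 250; Conservative 108; Mismatches 194; Indels 54; Gaps 14;",
]
stats = parse_stats(result_11)
print(stats["Score"], stats["Pred. No."], stats["Gaps"])
```

All values are returned as floats for simplicity; a stricter reader might keep counts such as Matches and Gaps as integers.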

Qy 357 LL PAAVVGTLVFLVGC FLVFSD I PTQELCGSKDS FEMC PLC- LDCP FWLL S S ACALAQAG 415 

I I I I : I I I I I : : : | : | : | I I I : I Ml I I : I | : 

Db 2 LFPAAFIGLFVFLYGVTTLDHSQVSKEVCQATDI I-MCPVCDKYCPFMRLSDSCVYAKVT 60 

Qy 416 RLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APM 474 

I I I : I I I I I : : I I I : I I : I I : I I I : I : I I I I I : I : I I I $ I I I : 
Db 61 HLFDNGATVFFAVFMAVWATVFLEFWKRRRAVIAYDWDLIDWEEEEEEIRPQFEAKYSKK 120 

Qy 475 TAPNPITGEDEPYFPERSRARRMLAGSWIVVMVAVVVMCLVSI ILYRAIMAIWSRSGN 534

I I I : I : I I I : I :: : I I : M : : I :: I I : :
Db 121 ERMNPISGKPEPYQAFTDKCSRLIVSASGI FFMICWIAAVFGIVIYRWTV S 173



Qy 535 TLLA-AWA SRIASLTGSW — NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDA 586 

I I II | :: | : | | : | | I : : I : : I : I : I I I I I ::: : I :: 

Db 174 TFAAFKVJALIRNNSQVAT-TGTAVCINFCIIMLLNVLYEKVALLLTNLEQPRTESEWENS 232 

Qy 587 FTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGCLIELAQELL 645

I I I I : I : I I I I I II I I I I I I I I I : I I I I I III I I I I : I : : 

Db 233 FTLKMFLFQFVNLNSSTFYIAFFLGRFTGH PGAYLRLINRWRLEECHPSGCLIDLCMQMG 292 

Qy 646 VIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE — G 703

: I I I I I II I : I : : I I : I : : : | I I I I I I I I

Db 293 IIMVLKQTWNNFMELGYPLIQNWWTR RKVRQEHGPERKI S FPQWEKDYNLQPMNAYG 349

Qy 704 LFDEYLEMVLQFGFVT I FVAACPLAPLFALLNNWVEI RLDARKFVCEYRRPVAERAQD IG 763 

11111111:11111 I I I I I I I I I I I I I I I I : I I I I I I III : : I I I : I I I : I I I 

Db 350 LFDEYLEMILQFGFTTIFVAAFPLAPLLALLNNI IEIRLDAYKFVTQWRRPLASRAKDIG 409 



Qy 764 IWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA 813

I I : I I I : I : I I : I I I : : I : I I I : I I I : : I : : I : I : 

Db 410 IWYGILEGIGILSVITNAFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLS 469 

Qy 814 RAPSSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFV 854 

I I : : I I I I : I I I : : I : : I I I I I I : 

Db 470 VFRISDFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFI 529 
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Qy 855 IVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGS 914 

: I: |:||:|: : :::|| II :: : I |: I : : I 

Db 530 IVFEHLVFCIKHLISYLI PDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGK 589

Qy 915 ELSSHW 920 

: I 

Db 590 AHHNEW 595 



RESULT 12 

US-11-072-512-2541 

Sequence 2541, Application US/11072512 
Publication No. US20060029945A1
GENERAL INFORMATION: 
APPLICANT: ISOGAI, TAKAO 
APPLICANT: SUGIYAMA, TOMOYASU 
APPLICANT: OTSUKI, TETSUJI 
APPLICANT: WAKAMATSU, AI 
APPLICANT: SATO, HIROYUKI 
APPLICANT: ISHII, SHIZUKO 
APPLICANT: YAMAMOTO, JUN-ICHI 
APPLICANT: ISONO, YUUKO 
APPLICANT: HIO, YURI 
APPLICANT: OTSUKA, KAORU 
APPLICANT: NAGAI, KEIICHI 
APPLICANT: IRIE, RYOTARO 
APPLICANT: TAMECHIKA, ICHIRO 
APPLICANT: SEKI, NAOHIKO 
APPLICANT: YOSHIKAWA, TSUTOMU 
APPLICANT: OTSUKA, MOTOYUKI 
APPLICANT: NAGAHARI, KENJI
APPLICANT: MASUHO, YASUHIKO
TITLE OF INVENTION: Novel full length cDNA
FILE REFERENCE: 084335-0191
CURRENT APPLICATION NUMBER: US/11/072,512 
CURRENT FILING DATE: 2005-03-07 
PRIOR APPLICATION NUMBER: US 60/350,978 
PRIOR FILING DATE: 2002-01-25 
PRIOR APPLICATION NUMBER: JP 2001-379298 
PRIOR FILING DATE: 2001-11-05 
NUMBER OF SEQ ID NOS: 4096
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 2541
LENGTH: 596
TYPE: PRT

ORGANISM: Homo sapiens 
US-11-072-512-2541 

Query Match 23.3%; Score 1154; DB 6; Length 596; 

Best Local Similarity 41.3%; Pred. No. 2.3e-102; 

Matches 250; Conservative 108; Mismatches 194; Indels 54; Gaps 14; 

Qy 357 LL P AA WGT L V F LVGC FL VF S DIPTQELCGSKDSFEMCPLC-LDCPFWLLS S AC AL AQ AG 415
       I I I I : I I I I I : :: | : | : | I I I : I I 1 I I I : I 1 :
Db 2 LFPAAFIGLFVFLYGVTTLDHSQVSKEVCQATDI I -MCPVCDKYCPFMRLSDSCVYAKVT 60

Qy 416 RLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAAS-APM 474
       I I I :| I I I I :: I I I : 1*1 : I I : I I I : I : I ! I I |: I: f I Mill :
Db 61

Qy 475
       I I I : I : III : |:: : I. I : II: : I:: I I : :
Db 121 ERMNPISGKPEPYQAFTDKCSRLIVSASGIFFMICWIAAVFGIVIYRWTV S 173

Qy 535 TLLA-AWA SRIASLTGSW — NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDA 586
       I I II | : : ! : | | : | | I : : I : : I : I : I I I | | : : : : | : :
Db 174

Qy 587
       I I I : I : I I I I I II I I I I I I I I I : I I I I I I I I I I I I : I
Db 233

Qy 646 VIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCE — G 703
       : I I I I I II I : I : : I I : i :: : I I I i I I I I I
Db 293
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X 1 



Qy 704 LFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIG 763

11111111:11111 MINI Mill III :|||||| ||| ::|||:| ||:|||
Db 350 LFDEYLEMILQFGFTTIFVAAFPLAPLLALLNNIIEIRLDAYKFVTQWRRPLASRAKDIG 409

Qy 764 IWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA 813 

11:111.: |:||:MI::| :II|:H I.: : |::| :|: 

Db 410 IWYGILEGIGILSVITNAFVIAITSDFIPRLVYAYKYGPCAGQGEAGQKCMVGYVNASLS 469

Qy 814 RAPSSFAAAHNRTCRYRAFRDDDGH YSQTYWNLLAIRLAFV 854 

II: : I I I I : ! I I : : I : : I I I I I I : 

Db 470 VFRISDFENRSEPESDGSEFSGTPLKYCRYRDYRDPPHSLVPYGYTLQFWHVLAARLAFI 529 

Qy 855 IVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDEQPKGS 914 

I I I I I : I I : I : I : I I : I : : : : : I I I I : : : | | : I : : I 

Db 530 IVFEHLVFCIKHLISYLI PDLPKDLRDRMRREKYLIQEMMYEAELERLQKERKERKKNGK 589 

Qy 915 ELSSHW 920 

: I 

Db 590 AHHNEW 595



RESULT 13 
US-10-631-467-681 

; Sequence 681, Application US/10631467 

; Publication No. US20050208496A1

; GENERAL INFORMATION: 

; APPLICANT: Genox Research Inc. 

TITLE OF INVENTION: Method for testing for broncheal asthma, or chronic obstructive pulmonary
; TITLE OF INVENTION: disease
; FILE REFERENCE: 3462.1005-000
; CURRENT APPLICATION NUMBER: US/10/631,467
; CURRENT FILING DATE: 2003-07-31 
; PRIOR APPLICATION NUMBER: JP 2003-077212 
; PRIOR FILING DATE: 2003-03-20 
; PRIOR APPLICATION NUMBER: JP 2002-229312 
; PRIOR FILING DATE: 2002-08-06 
; NUMBER OF SEQ ID NOS: 2086

SOFTWARE: PatentIn version 3.1
; SEQ ID NO 681 
LENGTH: 594 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-631-467-681 

Query Match 21.4%; Score 1061.5; DB 5; Length 594; 

Best Local Similarity 40.4%; Pred. No. 2.3e-93; 

Matches 240; Conservative 103; Mismatches 186; Indels 65; Gaps 15; 



Qy 383 ELCGSKDS FEMC PLC- LDCP FWLL S SACALAQAGRL FDHGGT VFFS LFMALWAVLLLE YW 441 

I : I : : I I I I I I : I : I i I I I I : I III: I I I I I : I I I I II : I : I 
Db 2 EMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVFMALWAATFMEHW 61 

Qy 442 KRKSAT LAYRWDCS DYEDTEE RPRPQFAA SAPMTAPNPITGEDEPYFPERS 492

III I I I I I : : I : I I I I : : I I : I I I : I 

Db 62 KRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNKET — DKVKLTWRD 119 

Qy 493 RARRMLAGSWI WMVAWVMCLVS II L YRAI MAI WS RSGNTLLAAWAS RI AS LTGS W 552

I I I I : I : I I :: : I : I I I | :: : : : : : I :: 

Db 120 RFPAYLTNLVSIIFMIAVTFAIVLGVIIYRISMAAALAMNSSPSVRSNIRVTVTATAVII 179 

Qy 553 NLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGR 612 

I I I I : : I : : I : I | | : | : : | : I I : I I : : I I i I : I : I I i I I I 

Db 180 NLWIILLDEVYGCIARWLTKIEVPKTEKSFEERLIFKAFLLKFVNSYTPIFYVAFFKGR 239

Qy 613 FVGY PGNYHTLF-GVRN E EC AAGGC L I EL AQ EL L V I MVGKQ V I-NNMQEVLIP K LKGWWQ 670 

II I I I : I : I 111111111:11 : I : |' I : I I I : I I I : I : I I I : I : 

Db 240 FVGRPGDYVYIFRSFRMEECAPGGCLMELCIQLSIIMLGKQLIQNNLFEIGI PKMKKLIR 299

Qy 671 KFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPL 730

: I : : : : I I I I I II I I : I I : : I t I I I I : I I I : I I I I I 

Db 300 YLKLKQQSPPDHEECVKRKQRYEVDYNLEPFAGLTPEYMEMI IQFGFVTLFVASFPLAPL 359 

Qy 731 FALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDF 790

I I I.I I I : I I I I I I : I I I I I I I I I I I : I I I II :: I I I: I I I I I I I : : : I : I I I
Db 360 FALLNNIIEIRLDAKKFVTELRRPVAVRAKDIGIWYNILRGIGKLAVIINAFVISFTSDF 419
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Qy 791 LPRA--YYRWTRAHDLRGFLNFTLARAPSSF AAAHN RTCRYRAF 832 

: I I I • - • : I I : I I I III II : I I I : : 

Db 420 I PRLVYLYMYSKNGTMHGFVNHTL SSFNVSDFQNGTAPNDPLDLGYEVQICRYKDY 475

Qy 833 RD DDGHY — SQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREY 887

I : : I I : : I : I I I I I I I I I I :: : I : : ! : : I I I I : : :: : I 
Db 476 REPPWSENKYDI SKDFWAVLAARLAFVIVFQNLVMFMSDFVDWVI PDI PKDI SQQI HKEK 535 

Qy 888 YLA — KQALAENEVLFGTNGTKDEQP KGSELSSH 919 

I I I I I I I I I (I || 

Db 536 VLMVELFMREEQDKQQLL — ETWMEKERQKDEPPCNHHNTKACPDSLGSPAPSH 587 



RESULT 14 
US-11-177-894-8 

Sequence 8, Application US/11177894 
Publication No. US20060040292A1 
GENERAL INFORMATION: 
APPLICANT: West, et al. 

TITLE OF INVENTION: Tumor Markers and Uses Thereof 
FILE REFERENCE: 2002850-0048 
CURRENT APPLICATION NUMBER: US/11/177,894 
CURRENT FILING DATE: 2005-07-08 
NUMBER OF SEQ ID NOS: 29
SOFTWARE: PatentIn version 3.2
SEQ ID NO 8 
LENGTH: 594 
TYPE: PRT 

ORGANISM: Artificial 
FEATURE: 

OTHER INFORMATION: Homo sapiens 
US-11-177-894-8 

Query Match 21.4%; Score 1061.5; DB 6; Length 594; 

Best Local Similarity 40.4%; Pred. No. 2.3e-93; 
Matches 240; Conservative 103; Mismatches 186; Indels 65; Gaps 15;

Qy 383 ELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYW 441
       1:1 : : Mill I : I : I I I I I I : I III: I I I I I : I I I I I I :|:|
Db 2 EMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVFMALWAATFMEHW 61

Qy 442 KRKS AT LAY RWDC S D Y E DT E E RPRPQFAA SAPMTAPNPITGEDEPYFPERS 492
       I I i I I I I I : : I : I I I I : : I i : I I I : I
Db 62 KRKQMRLNYRWDLTGFEEEEEAVKDHPRAEYEARVLEKSLKKESRNKET--DKVKLTWRD 119

Qy 493 RARRML AG S W I WMVAVVVMC L V S I ILYRAIMAI WSRSGNTLLAAWASRI ASLTGS W 552
       I I I I: 1:11 :: :|:|| || :: : : : : : I ::
Db 120

Qy 553
       III |::| ::| :| ||: |: :|: ||: I I : : I I I I : |:| I II II
Db 180

Qy 613
       I I I I I : I : I 111111111:11 : I : I I : I I I : I I I : I : I I I : I
Db 240

Qy 671
       : I I I I I II I I : I I : : I I I I I I : I I I : I I I I I
Db 300

Qy 731
       I I I I I I : I I I I I I : I I I I I I I I I I I : I I I I I :: I I I : I I I I I I I :: : I : I | I
Db 360

Qy 791 *GFLN FT LARA PiSS F AAAHN RTCRYRAF 832
       11:111 III II : I I I : :
Db 420 iGFVNHTL SSFNVSDFQNGTAPNDPLDLGYEVQICRYKDY 475

Qy 833
       : I : I I I I I i I I I I ::: I : : I : : I I I
Db 476

Qy 888 -KQALAENEVLFGTNGTKDEQP KGSELSSH 919
       I I I I III I I I I 1
Db 536
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RESULT 15 

US-10-066-543-1424 

Sequence 1424, Application US/10066543 
Publication No. US20030087818A1 
GENERAL INFORMATION: 



APPLICANT: Jiang, Yuqiu
APPLICANT: Pyle, Ruth A.
APPLICANT: Xu, Jiangchun
APPLICANT: Indirias, Carol Yoseph
APPLICANT: Lodes, Michael J.
APPLICANT: Secrist, Heather
APPLICANT: Carter, Darrick
APPLICANT: Fanger, Gary R.
APPLICANT: Smith, Carole L.
APPLICANT: Durham, Margarita
APPLICANT: Stolk, John A.

TITLE OF INVENTION: COMPOSITIONS AND METHODS FOR THE THERAPY 
TITLE OF INVENTION: AND DIAGNOSIS OF COLON CANCER 
FILE REFERENCE: 210121.563 
CURRENT APPLICATION NUMBER: US/10/066,543
CURRENT FILING DATE: 2002-01-31 
NUMBER OF SEQ ID NOS: 3417 
SOFTWARE: FastSEQ for Windows Version 4.0 
SEQ ID NO 1424 
LENGTH: 782 
TYPE: PRT 

ORGANISM: Homo sapiens 
US-10-066-543-1424 

Query Match 21.0%; Score 1037.5; DB 4; Length 782; 

Best Local Similarity 33.3%; Pred. No. 7.4e-91; 

Matches 284; Conservative 137; Mismatches 282; Indels 149; Gaps 28;

Qy 111 DFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCV DQQDVQDGNTT 166
       I : I I ! : : I : I II :: I I : I I I : I I : I I
Db 35

Qy 167
       II I I I I : : I II
Db 80 5VFGLY: RTLLLEPEGPAPHAELAA PTTIPV 114

Qy 227
       III: I : : I III I : : I I
Db 115 'SLRIRI— VNFWMNN KTSAGET FEDLMKDGV 146

Qy 287
       I I I I I I I I 11: Mil : I I : I : I I I I I I I I I I
Db 147 iARFPLHKG EG RLKKT WARWRHMFREQPVDEIRNYFGEKVALY. 190

Qy 347
       1111:11 I : I I I : I I I I I I I : : : I : I : I I I I I
Db 191

Qy 406
       I I I : I I I : I I I I :: I I I I I I : I I I I I : I : I I ::: : |
Db 250

Qy 466 EDEPYFPERSRARRMLAGSWIWMVAVVVMCLV SI IL 520
       : I : I | :|::| : ::::||: :::
Db 310 - NC PDYKLRPYQH S Y LRST VI LV — LT LLMI CLMI GMAH VL W 353

Qy 521
       II I : : | I : : I : | | : | : : | | : | | : I I I I
Db 354

Qy 581
       I I I :: I II : I I : I I I I II I : I I I I : : I I I I I I :
Db 414

Qy 640 )ELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGP WEDD 695
       : : : 1 I I I : : I I I : I | : || :| | : | | :
Db 474 'QMAI IMGLKQTLSNCVEYLVP WVTHKCRS — L RASES GH LP RDPELRDWRRN 526

Qy 696 YELVPCE — GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRR 753
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I I I I I I I : : I I : : I : i I I I M I I I I I I I II : I I I 1 I I I I I I II 

Db 527 YLLN PVNTFSLFDEFMEMMI QYGFTT I FVAAFPLAPLLALFSNLVE IRLDAI KMVWLQRR 586 



Qy 754 PVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW TRAHD 803 

I : I : I I I I : I : I I I I : I : : I I : I : I : I I I : : : 
Db 587 L V PR KAKD I GT W LQ VL ET I GVL AV I AN GMV I A FT S E F I PR W Y K Y R YS PCLKEGNSTVDC 646 

Qy 804 LRGFLNFTLA RAPSSFAAAHNRT-CRYRAFRD-DDGHYSQTYWNLLAIRLAFV 854 

I : I : : I : I : : I : I I I I I I :| : I : : I : : t I! I I I I I I I 

Db 647 LKGYVNHSLSVFHTKDFQDPDGIEGSENVTLCRYRDYRNPPDYNFSEQFWFLLAIRLAFV 706

Qy 855 IVFEHWFSVGRLLDLLVPDIPESVEIKV-KREYYLAKQALAENEVLFGTNGTKDEQPKG 913 

I : I 1.1 I : : I I I I I : II : I I : : I : : : I I I 

Db 707 ILFEHVALCIKLIAAWFVPDIPQSVKNKVLEVKYQRLREKMWHGRQRLGGVGAGSRPP— 764

Qy 914 SELSSHWTPFTV 925 

: : I I I :: 
Db 765 — MPAHPTPASI 774 



Search completed: October 27, 2006, 20:33:54 
Job time : 195 secs



SCORE 1.3 BuildDate: 12/06/2005 
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SCORE Search Results Details for Application 
10552515 and Search Result us-10-552-515-1.rapbn




This page gives you Search Results detail for the Application 10552515 and Search Result us-10-552-515-1.rapbn.


GenCore version 5.1.9
Copyright (c) 1993 - 2006 Biocceleration Ltd.

OM protein - protein search, using sw model

Run on:          October 27, 2006, 20:30:53 ; Search time 47 Seconds
                 (without alignments)
                 1646.261 Million cell updates/sec

Title:           US-10-552-515-1
Perfect score:   4950
Sequence:        1 MRMAATAWAGLQGPPLPTLC...SELSSHWTPFTVPKASQLQQ 933

Scoring table:   BLOSUM62
                 Gapop 10.0 , Gapext 0.5

Searched:        316380 seqs, 82930642 residues

Total number of hits satisfying chosen parameters: 316380

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
                 Maximum Match 100%
                 Listing first 45 summaries

Database:        Published_Applications_AA_New:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US09_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US06_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US07_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US08_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/PCT_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US10_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US11_NEW_PUB.pep:*
                 /EMC_Celerra_SIDS3/ptodata/1/pubpaa/US60_NEW_PUB.pep:*

Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
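The header of this search names a "sw model" with Gapop 10.0 and Gapext 0.5. As a rough illustration of what a Smith-Waterman local alignment score with affine gap penalties computes, the sketch below implements the standard Gotoh recurrences. It is not the GenCore implementation: the toy match/mismatch scores stand in for BLOSUM62, and charging the first gapped residue `gap_open + gap_extend` is an assumed convention. `smith_waterman` and `toy_score` are hypothetical names.

```python
def toy_score(x, y):
    """Stand-in for a substitution matrix such as BLOSUM62 (assumed values)."""
    return 5.0 if x == y else -4.0


def smith_waterman(a, b, score=toy_score, gap_open=10.0, gap_extend=0.5):
    """Best local alignment score of a and b with affine gaps (Gotoh recurrences)."""
    neg = float("-inf")
    n, m = len(a), len(b)
    H = [[0.0] * (m + 1) for _ in range(n + 1)]  # best score ending at (i, j)
    E = [[neg] * (m + 1) for _ in range(n + 1)]  # ending with a gap in a
    F = [[neg] * (m + 1) for _ in range(n + 1)]  # ending with a gap in b
    best = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open - gap_extend,
                          E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open - gap_extend,
                          F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + score(a[i - 1], b[j - 1])
            # Local alignment: scores never drop below zero.
            H[i][j] = max(0.0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# A sequence aligned to itself scores the sum of its match scores,
# which is the analogue of the "Perfect score" reported in the header.
print(smith_waterman("HEAGAWGHEE", "HEAGAWGHEE"))
```

With a real substitution matrix in place of `toy_score`, the self-alignment score of the query would play the role of the perfect score against which each hit's Query Match percentage is reported.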



SUMMARIES

                   %
Result             Query
   No.    Score    Match  Length  DB  ID                       Description
---------------------------------------------------------------------------------
     1     4950    100.0     933   6  US-10-552-515-1          Sequence 1, Appli
     2      905     18.3     642   7  US-11-293-697-4483       Sequence 4483, Ap
     3    684.5     13.8     483   7  US-11-293-697-3990       Sequence 3990, Ap
     4    594.5     12.0     660   7  US-11-293-697-3644       Sequence 3644, Ap
     5    411.5      8.3     393   7  US-11-197-712-457        Sequence 457, App
     6    408.5      8.3     876   6  US-10-449-902-42265      Sequence 42265, A
     7      353      7.1     466   6  US-10-449-902-50939      Sequence 50939, A
     8    283.5      5.7     516   6  US-10-953-349-5638       Sequence 5638, Ap
     9    283.5      5.7     543   6  US-10-953-349-5637       Sequence 5637, Ap
    10    283.5      5.7     558   6  US-10-953-349-5636       Sequence 5636, Ap
    11      105      2.1    1089   6  US-10-196-749-266        Sequence 266, App
    12      104      2.1     697   6  US-10-449-902-49799      Sequence 49799, A
    13      102      2.1     442   7  US-11-056-355B-8075      Sequence 8075, Ap
    14      101      2.0     529   6  US-10-449-902-43353      Sequence 43353, A
    15      101      2.0     532   6  US-10-449-902-52268      Sequence 52268, A
    16      100      2.0     453   6  US-10-805-394-6877       Sequence 6877, Ap
    17      100      2.0     472   6  US-10-449-902-53989      Sequence 53989, A
    18      100      2.0    3010   7  US-11-140-487A-770       Sequence 770, App
    19     96.5      1.9     477   6  US-10-449-902-51479      Sequence 51479, A
    20     96.5      1.9     477   6  US-10-449-902-56002      Sequence 56002, A
    21       96      1.9     258   6  US-10-953-349-27870      Sequence 27870, A
    22       96      1.9     258   7  US-11-056-355B-65384     Sequence 65384, A
    23     95.5      1.9     460   7  US-11-056-355B-70833     Sequence 70833, A
    24     95.5      1.9     517   7  US-11-056-355B-70832     Sequence 70832, A
    25     95.5      1.9     572   7  US-11-056-355B-70831     Sequence 70831, A
    26       95      1.9     530   7  US-11-296-657-10         Sequence 10, Appl
    27       95      1.9     735   7  US-11-366-965-799        Sequence 799, App
    28       95      1.9    3010   7  US-11-140-487A-769       Sequence 769, App
    29     93.5      1.9     466   7  US-11-434-137-2794       Sequence 2794, Ap
    30     93.5      1.9     466   7  US-11-434-184-2794       Sequence 2794, Ap
    31     93.5      1.9     466   7  US-11-434-199-2794       Sequence 2794, Ap
    32     93.5      1.9     466   7  US-11-434-203-2794       Sequence 2794, Ap
    33     93.5      1.9     790   6  US-10-449-902-52717      Sequence 52717, A
    34       93      1.9     928   7  US-11-056-355B-107694    Sequence 107694,
    35       93      1.9     928   7  US-11-056-355B-118933    Sequence 118933,
    36       93      1.9     945   7  US-11-056-355B-107693    Sequence 107693,
    37       93      1.9     945   7  US-11-056-355B-118932    Sequence 118932,
    38       93      1.9     971   7  US-11-056-355B-107692    Sequence 107692,
    39       93      1.9     971   7  US-11-056-355B-118931    Sequence 118931,
    40     92.5      1.9     424   7  US-11-056-355B-17569     Sequence 17569, A
    41     92.5      1.9     510   6  US-10-449-902-31185      Sequence 31185, A
    42     92.5      1.9     613   7  US-11-056-355B-17568     Sequence 17568, A
    43     92.5      1.9     635   7  US-11-056-355B-17567     Sequence 17567, A
    44     92.5      1.9     703   7  US-11-056-355B-71193     Sequence 71193, A
    45     92.5      1.9     715   7  US-11-056-355B-71192     Sequence 71192, A



ALIGNMENTS 



RESULT 1 
US-10-552-515-1 

Sequence 1, Application US/10552515 
Publication No. US20060194204A1 
GENERAL INFORMATION: 
APPLICANT: The Government of the United States of America as 
APPLICANT: represented by the Secretary of the Department of Health and 
APPLICANT: Human Services 
APPLICANT: Bera, Tapan K.
APPLICANT: Pastan, Ira H. 
APPLICANT: Lee, Byungkook 

TITLE OF INVENTION: GENE EXPRESSED IN PROSTATE CANCER AND METHODS OF USE 
FILE REFERENCE: 4239-68223-02
CURRENT APPLICATION NUMBER: US/10/552,515
CURRENT FILING DATE: 2005-10-06
PRIOR APPLICATION NUMBER: PCT/US2004/10588
PRIOR FILING DATE: 2004-04-05 
PRIOR APPLICATION NUMBER: 60/461,399 
PRIOR FILING DATE: 2003-04-08 
NUMBER OF SEQ ID NOS: 12 
SOFTWARE: PatentIn version 3.2
SEQ ID NO 1 
LENGTH: 933 
TYPE: PRT 

ORGANISM: Artificial Sequence 
FEATURE: 

OTHER INFORMATION: Splice Variant-Novel Gene Expressed in Prostate 
US-10-552-515-1 

Query Match 100.0%; Score 4950; DB 6; Length 933;

Best Local Similarity 100.0%; Pred. No. 0; 

Matches 933; Conservative 0; Mismatches 0; Indels 0; Gaps 0; 

Qy    1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db    1 MRMAATAWAGLQGPPLPTLCPAVRTGLYCRDQAHAERWAMTSETSSGSHCARSRMLRRRA 60

Qy   61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db   61 QEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDL 120

Qy  121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  121 KLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLC 180

Qy  181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRF 240
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  181 YYAEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRF 240

Qy  241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  241 LGSDNQDTFFTSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKT 300

Qy  301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  301 PPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPA 360

Qy  361 AVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  361 AVVGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDH 420

Qy  421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  421 GGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPI 480

Qy  481 TGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  481 TGEDEPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAW 540

Qy  541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  541 ASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY 600

Qy  601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  601 SSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEV 660

Qy  661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  661 LIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI 720

Qy  721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISN 780

Qy  781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  781 AFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYS 840

Qy  841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900
        ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Db  841 QTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVL 900

Qy  901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933
        |||||||||||||||||||||||||||||||||
Db  901 FGTNGTKDEQPKGSELSSHWTPFTVPKASQLQQ 933





RESULT 2 

US-11-293-697-4483 

; Sequence 4483, Application US/11293697 

; Publication No. US20060105376A1 

; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA 

; FILE REFERENCE: H1-A0106

; CURRENT APPLICATION NUMBER: US/11/293,697

; CURRENT FILING DATE: 2005-12-05

; PRIOR APPLICATION NUMBER: US/10/108,260

; PRIOR FILING DATE: 2002-03-28

; NUMBER OF SEQ ID NOS: 5458

; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 4483

LENGTH: 642

TYPE: PRT 

ORGANISM: Homo sapiens 
US-11-293-697-4483 



Query Match 18.3%; Score 905; DB 7; Length 642; 

Best Local Similarity 35.4%; Pred. No. 2.9e-73; 

Matches 222; Conservative 106; Mismatches 219; Indels 80; Gaps 17; 



Qy   26 GLYCRDQAHAERWAMT--SETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYG 83
        i 1 1 1 1 : : : 1 1 : 1 1 1 1 1 1 : 1
Db   24 GLYFRDGRRKVDYI LVYHHKRPSG NRTLVRRVQHSDTP SGA 64

Qy   84 STAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQQDSAARDRTDMHRTWRET 143
        : 1 : I : 1 1 1 1 :| : 1 III
Db   65 RSVKQDHPLPGKGASLDAGSGEPP MDYHEDD KRFRREE 102

Qy  144 FL DN L RAAGL CV DQQD VQ DGNT T VH Y ALL S A SWAV LC YY AE DLRLKLPLQELPNQAS 200
        : II III :: :| :| :| : : 1 1 III N |:||:| ::: :
Db  103 YEGNLLEAGLELE RDEDTK I HGVGFVK I H APWNVLCREAEFLKLKMPTKKMYH — I 156

Qy  201 NWSAGLLAWLGI PNVLLEWPDVPPEYYSCR FRVNKLPRFLGSDNQDTFF 250
        1 : 1 1 1 1 : 1 1 : : : 1 : 1 1 1 1 II :|:||
Db  157 NETRGLLK--KINSVLQKITDPIQPKVAEHRPQTMKRLSYPFSREKQHLFDLSD-KDSFF 213

Qy  251 TSTKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRL 310
        1 1 1 : : 1 1 1 : 1 1 : : 1 1 1 1 1 1 1 : 1 1 : 1 1 1 1 1 : :
Db  214 DSKTRSTIVYEILKRTTCTKAKYS-MGITSLLANGVYAAAYPLHDGDY NGENVEF 267

Qy  311 NQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAVVGTLVFLV 370
        1 1 : : 1 :: 111:1": 1 1 1 1 : 1 1 1 :l 1 1 1 1 : 1 1 1 1 i 1 1 II 1 : 1 1 :: 1 1 : M 1
Db  268 NDRKLLYEEWARYGVFYKYQPIDLVRKYFGEKIGLYFAWLGVYTQMLIPASIVGIIVFLY 327

Qy  371 GCFLVFSDIPTQELCGSKDSFEMCPLC-LDCPFWLLSSACALAQAGRLFDHGGTVFFSLF 429
        II : :||: |:| : : Mill 1 :| :lllll |:| III: 11111=1
Db  328 GCATMDENIPSMEMCDQRHNITMCPLCDKTCSYWKMSSACATARASHLFDNPATVFFSVF 387

Qy  430 MALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAA SAPMTAPNPITGED 484
        1 1 1 1 1 : 1 :l 1 1 I : : | : | : | | : : | | : | | |
Db  388 MALWAATFMEHWKRKQMRLNYRWDLTGFEEEEDHPRAEYEARVLEKSLKKESRNKET--D 445

Qy  485 EPYFPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRI 544
        : II 1 11:1:11 : : : 1 : 1 1 1 1 : : : : : :
Db  446 KVKLTWRDRFPAYLTNLVSIIFMIAVTFAIVLGVIIYRISMAAALAMNSSPSVRSNIRVT 505

Qy  545 ASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPV 604
        : 1 ::||| |::| ::| • :| II: |: :|: II: 1 1: :IM h
Db  506 VTATAVIINLWIILLDEVYGCIARWLTKIEVPKTEKS FEERLIFKAFLLKFVNSYTPIF 565

Qy  605 YIAFFKGRFVGYPGNYHTLF-GVRNEE 630
        1 : 1 1 1 1 1 1 1 1 1 1 1 : 1 : 1 III
Db  566 YVAFFKGRFVGRPGDYVYIFRSFRMEE 592
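
The percentages printed with each hit appear to follow directly from the counts listed beside them. The following is a minimal sketch, assuming that Query Match is the hit score as a percentage of the perfect score (4950 for this query, per the report header) and that Best Local Similarity is the share of identical positions among all aligned columns (matches + conservative + mismatches + indels). Both definitions are inferred from the numbers in this report rather than taken from GenCore documentation, and the function names are hypothetical.

```python
def query_match_pct(score: float, perfect_score: float) -> float:
    """Hit score as a percentage of the perfect (self-match) score."""
    return round(100.0 * score / perfect_score, 1)


def best_local_similarity_pct(matches: int, conservative: int,
                              mismatches: int, indels: int) -> float:
    """Identical positions as a percentage of all aligned columns."""
    columns = matches + conservative + mismatches + indels
    return round(100.0 * matches / columns, 1)


# RESULT 2 figures from this report (perfect score 4950 from the header).
print(query_match_pct(905, 4950))                    # 18.3
print(best_local_similarity_pct(222, 106, 219, 80))  # 35.4
```

Under these assumptions the same two functions also reproduce the figures printed for RESULT 3 (13.8% and 35.1%).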




RESULT 3 
US-11-293-697-3990





Sequence 3990, Application US/11293697 
Publication No. US20060105376A1 



; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA 

; FILE REFERENCE: H1-A0106 

; CURRENT APPLICATION NUMBER: US/11/293,697

; CURRENT FILING DATE: 2005-12-05

; PRIOR APPLICATION NUMBER: US/10/108,260

; PRIOR FILING DATE: 2002-03-28

; NUMBER OF SEQ ID NOS: 5458

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 3990 

LENGTH: 483 

TYPE: PRT 

ORGANISM: Homo sapiens 
US-11-293-697-3990 



Query Match 13.8%; Score 684.5; DB 7; Length 483; 

Best Local Similarity 35.1%; Pred. No. 1.9e-53; 

Matches 171; Conservative 95; Mismatches 168; Indels 53; Gaps 14; 



Qy  479 PITGEDE PYFPERSRARRMLAGSVVIVVMVAVVVMCLV SI ILYRAIM 525
        1 |: 1 1 : 1 1 : 1 :: 1 : ::: : I I : :::| | :
Db    2 PAVSEEEMALQLINCPDYKLRPYQHSYLRSTVILV--LTLLMICLMIGMAHVLWYRVLA 59

Qy  526 AIVVSRSGNTLLAAWASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFED 585
        : : 1 1 1 : : 1 1 : : 1 : 1 1 : 1 : : 1 1 : 1 1 : 1 1 1 1 : : 1
Db   60 SALFSSSAVPFLEEQVTTAVWTGALVHYVTIVIMTKINRRVALKLCDFEMPRTFSERES 119

Qy  586 AFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGV-RNEECAAGGCLIELAQEL 644
        1 1 : : 1 II : 1 1 : 1 1 1 1 II 1 : 1 1 1 1 : : 1 1 1 1 1 1 : : : 1 : :
Db  120 RFTIRFFTLQFFTHFSSLIYIAFILGRINGHPGKSTRLAGLWKLEECHASGCMMDLFVQM 179

Qy  645 LVIMVGKQVINNMQEVLI PKLKGWWQKFRLRSKKRKAGASAGASQGP WEDDYELVP 700
        :|| II ::l 1 |:| 1 : II :| 1 : 1 1 :| I 1
Db  180 AIIMGLKQTLSNCVEYLVP WVTHKCRS — LRASESGHLPRDPELRDWRRNYLLNP 232

Qy  701 CE--GLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAER 758
        1 1 1 1 : : I 1 : : 1 : 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 : 1 1 1 1 1 1 II 1 1 ill :
Db  233 VNTFSLFDEFMEMMIQYGFTTIFVAAFPLAPLLALFSNLVEIRLDAIKMVWLQRRLVPRK 292

Qy  759 AQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRW TRAHDLRGFL 808
        1 : 1 1 1 1 : 1 : 1 1 1 1 : 1 : : I I : I : I : I I I : : : I : I ::
Db  293 AKD I GT WL QVLET I GVLAV I AN GMVT AFT S E F I P RWYKY RY S PCL KE GN ST VDCL KGYV 352

Qy  809 NFTLA RAPS SFAAAHNRT-CRYRAFRD-DDGH YSQT YWNLLAI RLAFVIVFEH 859
        1 : 1 : : 1 : 1 1 1 1 1 1 : 1 : 1 : : 1 : H 1 1 1 1 1 1 1 1 1 1 : 1 1 1
Db  353 NHSLSVFHTKDFQDPDGIEGSENVTLCRYRDYRNPPDYNFSEQFWFLLAIRLAFVILFEH 412

Qy  860 VVFSVGRLLDLLVPDIPESVEIKV-KREYYLAKQALAENEVLFGTNGTKDEQPKGSELSS 918
        1 : : 1 1 1 1 !: 1 1 : 1 I : : 1 :: : II 1 : :
Db  413 VALC I KL I AAWFVP D I PQSVKN KVLE VKYQRLRE KMWHGRQRLGGVGAGS RP P MPA 468

Qy  919 HWTPFTV 925
        III::
Db  469 HPTPASI 475





RESULT 4 

US-11-293-697-3644 

; Sequence 3644, Application US/11293697 
; Publication No. US20060105376A1 
; GENERAL INFORMATION: 

; APPLICANT: HELIX RESEARCH INSTITUTE 

; TITLE OF INVENTION: Novel full length cDNA 

; FILE REFERENCE: H1-A0106 

; CURRENT APPLICATION NUMBER: US/11/293,697

; CURRENT FILING DATE: 2005-12-05

; PRIOR APPLICATION NUMBER: US/10/108,260

; PRIOR FILING DATE: 2002-03-28

; NUMBER OF SEQ ID NOS: 5458

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 3644

LENGTH: 660

TYPE: PRT 

ORGANISM: Homo sapiens 
US-11-293-697-3644 

Query Match 12.0%; Score 594.5; DB 7; Length 660; 

Best Local Similarity 24.1%; Pred. No. 4.1e-45; 

Matches 171; Conservative 122; Mismatches 231; Indels 185; Gaps 20; 

Qy 228 YSCRFRVNKLPRFLG-SDNQDTFFTSTKRHQILFEILAKT PYGHEKKNLLG 277 

: : I I I I I I I I I I : I : I I : II 

Db 100 FTYRTRQN FKGFDDNNDDFLTMAECQFI I KHELENLRAKDEKMIPGY 146 

Qy 278 IHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQVLFQHW-ARWGK 325 

: : I I I : : I I I I I : I I I : 

Db 147 PQAKLYPGKSLLRRLLTSGIVIQVFPLHDS EALKKLEDTWYTRFAL 192

Qy 326 WNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELC 385 

I I I I : I : I I I I I : I I I I : I :: I I : I I I: I 
Db 193 — KYQPIDSIRGYFGETIALYFGFLEYFTFALIPMAVIG 229 



Qy  386 GSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKS 445
        1 : : I 1 : | |: 1 : I : : : | | | | |
Db  230 LPYYLFVWE DYDKYVI FAS FNLI WSTVI LELWKRGC 265

Qy  446 ATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSVVIV 505
        1:111 : 1 1 1 1 1 : 1 1 1 1 : : 1 1 : 1 1 1 : 1 : 1
Db  266 ANMTYRWGTLLMKRKFEEPRPGFHG VLGINSITGKEEPLYPSYKRQLRI YLVSLPFV 322

Qy  506 VMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASLTGSVVNLVFILILSKIYV 565
        : : : : 1 : 1 : : : : | | : I : : : | | : : : : |
Db  323 CLCLYFSLYVMMIYFDMEVWALGLHENSG SEWTS-VLLYVPSIIYAIVIEIMNRLYR 378

Qy  566 SLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFG 625
        1 II II 1 ! : : : : : IIS : 1 1:1 : : I 1 1 1 1
Db  379 YAAEFLTSWENHRLESAYQNHLILKVLVFNFLNCFASLFYIAFV -— 422

Qy  626 VRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKGWW--QKFRLRSKKRKAGA 683
        : : : : ill : : : 1 : : 1 : 1 : 1 : 1 : 1 1 : :
Db  423 LKDMKL LRQSLATLLITSQILNQIMESFLP YWLQRKHGVRVKRKVQAL 470

Qy  684 SAGASQGPWED DYELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEI 740
        1 : 1 : 1 : 1 1 1 : 1 1 1 : 1 I 1 ! : 1 : : 1 III 11:111:1:
Db  471 KADIDATLYEQVILEKE^GTYLGTFDDYLELFLQFGYVSLFSCVYPLAAAFAVLNNFTEV 530

Qy  741 RLDARKFVCEYRRPVAERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTR 800
Db  531 NSDALKMCRVFKRPFSEPSANIGVWQLAFETMSVISWTNCALIGMSPQV--NAVFPESK 588

Qy  801 AHDLRGFLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYSQTYWNLLAIRLAFVIVFEHV 860
        III : I : I I
Db  589 A-DL IL I WAVE HA 601

Qy  861 VFSVGRLLDLLVPDIPESVEIKVKREYYLAKQALAENEVLFGTNGTKDE 909
        : :: :| : 1 1 1 : : : 1 : 1 : : : I I : :: I I : 1
Db  602 LLALKFILAFAIPDKPRHIQMKLARLEFESLEALKQQQMKLVTENLKEE 650




RESULT 5 
US-11-197-712-457





; Sequence 457, Application US/11197712 
; Publication No. US20060130160A1
; GENERAL INFORMATION: 

; APPLICANT: Dumas Milne Edwards, Jean Baptiste
; APPLICANT: Bougueleret, Lydie 
; APPLICANT: Jobert, Severin 

; TITLE OF INVENTION: FULL-LENGTH HUMAN cDNAs ENCODING POTENTIALLY SECRETED PROTEINS 

; FILE REFERENCE: 78.US4.CIP 

; CURRENT APPLICATION NUMBER: US/11/197,712 

; CURRENT FILING DATE: 2005-08-04 

; PRIOR APPLICATION NUMBER: US/09/876,997 

; PRIOR FILING DATE: 2001-06-08 

; PRIOR APPLICATION NUMBER: US 09/731,872 

; PRIOR FILING DATE: 2000-12-07

; PRIOR APPLICATION NUMBER: US 60/187,470 

; PRIOR FILING DATE: 2000-03-06 

; PRIOR APPLICATION NUMBER: US 60/169,629 

; PRIOR FILING DATE: 1999-12-08 

; NUMBER OF SEQ ID NOS: 482

; SOFTWARE: Patent.pm

; SEQ ID NO 457

LENGTH: 393

TYPE: PRT

ORGANISM: Homo sapiens

US-11-197-712-457 

Query Match 8.3%; Score 411.5; DB 7; Length 393; 

Best Local Similarity 23.8%; Pred. No. 7.4e-29; 

Matches 111; Conservative 95; Mismatches 172; Indels 89; Gaps 11; 

Qy 448 LAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSVVIVVM 507

: I I I : I III I : I I I I : : I I : I I I : I : I : 

Db 1 MTYRWGTLLMKRKFEEPRPGFHG VLGINSITGKEEPLYPSYKRQLRI YLVSLPFVCL 57 

Qy 508 VAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASLTGSVVNLVFILILSKIYVSL 567

: :: I : |: : : : I I : I:: : I |::::| 



Db 


58 


Qy 


568 


Db 


114 


Qy 


628 


Db 


158 


Qy 


686 


Db 


206 


Qy 


743 


Db 


266 


Qy 


803 


Db 


323 


Qy 


863 


Db 


337 



58 CLYFSLYVMMIYFDMEVWALGLHENSG SEWTS-VLLYVPSI IYAIVIEIMNRLYRYA 113 

MVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVR 627 
I MUM::::: III : I I : I : : I I I I I :: 



-LK 157 



III ::: |::| : I :| :| :| :| |:: I 
-LRQSLATLLITSQILNQIMESFLP YWLQRKHGVRVKRKVQALKA 205 



I I I : I I I : I I I I : I : : I I I I I I : I I I : I : 



III : : | | : | : : | I : I : : : : I : : I I : I I : : : I. 



: I : II: 
- 1 L I WAVE HALL 336 



III : : : I : I : : : I I : :: I I : I 



RESULT 6 

US-10-449-902-42265 

Sequence 42265, Application US/10449902 
Publication No. US20060123505A1 
GENERAL INFORMATION: 
APPLICANT: National Institute of Agrobiological Sciences. 
APPLICANT: Bio-oriented Technology Research Advancement Institution. 
APPLICANT: The Institute of Physical and Chemical Research. 
APPLICANT: Foundation for Advancement of International Science. 
TITLE OF INVENTION: FULL-LENGTH PLANT cDNA AND USES THEREOF 
FILE REFERENCE: MOA-A0205Y1-US 
CURRENT APPLICATION NUMBER: US/10/449,902
CURRENT FILING DATE: 2003-05-29 
PRIOR APPLICATION NUMBER: JP 2002-203269 
PRIOR FILING DATE: 2002-05-30 
PRIOR APPLICATION NUMBER: JP 2002-383870 
PRIOR FILING DATE: 2002-12-11 
NUMBER OF SEQ ID NOS: 56791
SOFTWARE: PatentIn Ver. 2.1
SEQ ID NO 42265 
LENGTH: 876 
TYPE: PRT 

ORGANISM: Oryza sativa 
US-10-449-902-42265 

Query Match 8.3%; Score 408.5; DB 6; Length 876; 

Best Local Similarity 22.4%; Pred. No. 4.2e-28; 

Matches 198; Conservative 110; Mismatches 285; Indels 291; Gaps 39; 

Qy 73 PP EAEK — RGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEEDLK — L 122 

II | : | : : | | I : : I : I I I : I I I : : II 

Db 28 PPIQSALHSAQKAAKDAVGSARDATSSSSRTTEYPVPQSHPDSQIADYVLVFQHIAKKYL 87 

Qy 123 DRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYY 182 

I : I: : : : : : I I I I ■ :: I I : I : 

Db 88 RS ST KV PAAE RS KI AAE Y - DALVQR I RDT GLHVT S REGS KG SGQILLFV 135 

Qy 183 AEDLRLKLPLQELPNQ — ASNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKL — P 238 

I : I I : I I | :: I : I : I II I : I 

Db 136 KADSQL LHQLARQEALSDYLHGVLS VQPPPP RSSSLADA 174 

Qy 239 RFLGSDNQDT FFTSTKRHQI LFEI LA-KT PYGHEKKNLLGI HQLLAEGVLSAA FP— 292 

I I I ::: :| : I I I I I: : I II 

Db 175 SFQKQQPSSLHLTPASRLRLVDSLLTLPSLASHAVKNAAGSSQVPSGAGLRLGLKEFPHL 234 

Qy 293 LHDGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQP LDHVR 336 

: I I II | : : | : | I I : I 

Db 235 VDMSAIHD PAYNSA WMK — RWSHTS PAKLFSGI GLADLDS I R 274 



Qy 


jj / 


RY FGEKVALY FAWLGFYT GWLL PAAWGTLVFLVGC FLVFSDI PTQELCGS KDS FEMC PL 
: 1 1 1 Mill : 1 II 1 1 1 1 : : 1 

rurrrn\/BT vrrn wrvrosT sdsst tt ___________ ____ 


396 


Db 


£.10 


in? 


Qy 


TQT 

J3 / 


Mil III : Ml : I: 1 M 1: 1 II II 

T 7V FT.n T ^DDr CDWCT f"*T \7CUCPT U'WE'T HDMVTDl^T R\/DHrTT C 


4 JO 


Db 


303 


1 A C 

jh b 


Qy 


4 57 


YEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWI WMVAW 

: ! 1 I 1 Mill 1 II:: :: 1 : : 


Dl 1 


Db 


34 7 


ucro/nDo Dunn/DDT to Tr\Dii , rirc , Dirir\/irc*iATtjJD , DC , T d\/tt ct DT\/iiirir , _c\/T ttQTMT' 




Qy 


512 


VMCLVSII LYR AIMAIWSRSGNTLLAAWASRIASLTGSWNLVF 

MM! II 1:: : 1 : ::| II 

t ii/n n. rr t t— t fp at vurni i/r\7\ mdct dt a t t \n/7\'\/rj/^ t T A ti T»i/~\ 


t; p. a 
DOb 


Db 


404 


AAA 
H 4 4 


Qy 


557 


ILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFY SSPVYI 

': 1 MMI 1 : :: : III 1 1 : 1 M Ml 

7\T7\W7\ TTVT«7T?KTIJLIC7\ VCVHVCT TT WD tr*7\M/^i7i TTSVraT TT C3VVTVT DIT/TC 


bUb 


Db 


a a a 
4 4 5 


h y d 


Qy 


607 


AFFKGR FVGYPGNYHTLFGVRNEECAAGGCLI ELAQ 

II: 1 1 1 1 

TMrTur/^DrrrDnc t au j\ T dtpm t c d \cc~ Tnru TM DHDMUT ____________________ 


C A O 

bQ _ 


Db 


a q a 
4 9b 




Qy 


64 3 


ELLVIMVGKQVINNMQEVLI PKLKGWWQKFR LRSKKRKAGASA 

M : I I :| |: M 1 : ::| 1 :|: II: 
QLFAVSVTSQFVNAFTELAL PVLMRKFAEWRGERAAQKNADGSSQRQPQRQDbAbSSAPb 


boo 


Db 


536 


cnc 
j"j 


Qy 


686 


--GASQGPWED DYELVPCEGLFDEYLEMVLQFGFVTI FVAAC 

II: 1 I: : :| 1 : II :l II Ml::|:: 
SGDWPLEPGAADGGKEESERRFVSRWKELQLPPYD-LFGDYAEMATQFGYITLWSVVW 


1 o c 


Db 


596 


654 


Qy 


726 


PL AP L FAL LN NWVE I RL D AR KFVC E Y RR PVAE RAQD I G I W FH I LAG LT HL AV I S N A F 

MM: : M : 1 M II 1 1 1 II It: 11 I : 1 : : 1 1 : 
P L S P VMG FVN N F FEL R S D AA K I S VN N RR PV P VRAET IGPWLEAFGFI AWL GALN N AAL VY 


782 


Db 


655 


714 


Qy 


783 


LLAFSSDFLPRAYYRW TRAHDLRGFLNFTLA-RAPSSFA 820 

1 1 : 1 : lit 1 II 1 I 1 1 : 
LFQQSEHAHLEGHSRYETTMRTHLHPSRA — NLTLADDAQSRFS 756 




Db 


715 





RESULT 7 

US-10-449-902-50939 

; Sequence 50939, Application US/10449902 
; Publication No. US20060123505A1 
; GENERAL INFORMATION: 

; APPLICANT: National Institute of Agrobiological Sciences. 

; APPLICANT: Bio-oriented Technology Research Advancement Institution. 

; APPLICANT: The Institute of Physical and Chemical Research. 

; APPLICANT: Foundation for Advancement of International Science. 

; TITLE OF INVENTION: FULL-LENGTH PLANT cDNA AND USES THEREOF 

; FILE REFERENCE: MOA-A0205Y1-US 

; CURRENT APPLICATION NUMBER: US/10/449,902 

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: JP 2002-203269 

; PRIOR FILING DATE: 2002-05-30 

; PRIOR APPLICATION NUMBER: JP 2002-383870

; PRIOR FILING DATE: 2002-12-11

; NUMBER OF SEQ ID NOS: 56791

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 50939

LENGTH: 466

TYPE: PRT

ORGANISM: Oryza sativa 
US-10-449-902-50939 

Query Match 7.1%; Score 353; DB 6; Length 466; 

Best Local Similarity 23.8%; Pred. No. 1.9e-23; 

Matches 137; Conservative 82; Mismatches 193; Indels 164; Gaps 20; 

Qy 346 YFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLL 405 

I I :: I I I I I I I I I I: I : lh 
Db 13 YFSFLGMYTRWLFFPAVFGLATQLID FGSLQ WLV 46

Qy 406 SSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRW DCSDYEDTE 461

I i I : I I I I : : I I I I : : : I I I : I : 



Db 


4 7 


Ov 
vy 


462 


Db 


90 


Ov 
vy 


516 


Db 


14 6 


Ov 
vy 


57 6 


Db 


194 


Ov 

vy 


636 


nh 

UJJ 


£.3 X 


vy 


692 


UD 




Ov 

vy 


74 2 


UD 




vy 




Db 


392 


Qy 


862 


Db 


418 



- FFFFVI SWAVFFLQFWKRKN SAVLARWG INYSFSEYKTMG 89 

rGEDEPYFPERSRARRMLAGSWI WMVAVWMCL 515 

I : |:| :| |:: ::::|:: : I 

PGAPK EKSIVQRNEWFGVLLRIRNNAI IVLAIICLQL 145 



: I I | | | ::| :: | ::| 

fVLTA VYLAAIQYYTRIGGKVSVTLIKYE 193 



III I III I : I II 
fKVF GLYFMQSYIGLF YHASLH RN 230 



:: I I I: :: I I : I : I III ::|:| : II : : II I 
- IMALRQVLI KRL I VSQVLENLI ENS I PYLNYS YKKYRAVHKKKHEKES PAGKSVRLSTR 289 

^EDDYELVPC EGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIR 741 

l-l I I : I I I I : : I I : I I I I : : ! II II Ml III 



III: : I I I I I I : I I : I : : I II 
VDALFCLLVMLKRPAPRDAATIGAWLNIFQFLWMAICTNCLLL 391 



I : : I : : I I :: : I I : 

-DEEGKWK IEPGLAAILIMEHAL 417 



II: I : I I I : I : Ml 



RESULT 8 

US-10-953-349-5638 

Sequence 5638, Application US/10953349 
Publication No. US20060107345A1 
GENERAL INFORMATION: 
APPLICANT: ALEXANDROV, Nickolai et al. 

TITLE OF INVENTION: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING POLYPEPTIDES 
TITLE OF INVENTION: ENCONDED THERBY 
FILE REFERENCE: 2750-1579PUS2 
CURRENT APPLICATION NUMBER: US/10/953,349
CURRENT FILING DATE: 2004-09-30
NUMBER OF SEQ ID NOS: 40252
SOFTWARE: PatentIn version 3.3
SEQ ID NO 5638
LENGTH: 516
TYPE: PRT

ORGANISM: Arabidopsis thaliana
US-10-953-349-5638 

Query Match 5.7%; Score 283.5; DB 6; Length 516; 

Best Local Similarity 23.4%; Pred. No. 4.2e-17; 

Matches 127; Conservative 78; Mismatches 167; Indels 171; Gaps 22; 

r F FS L FMALWAVLL LE YWKRKS AT L AYRWDCSDYEDTEERPRPQFAASAPMTAP 4 77 

I I I I I :: I I I I : I I I I : :: I 
LWAAL FLQ FWKRKN AALL AS QGYR FLGMEWS S L PFPKELIKNLG 154 



Qy 


422 


Db 


108 


Qy 


478 


Db 


155 


Qy 


533 


Db 


208 


Qy 


593 


Db 


237 


Qy 


650 



III I I I I ::| :::::: I : | | |: | 

CEKEAYQRYEWFAYRKRFRN DVLVIMSIICLQLPFELAYAHIFEIITSD- 207 



: : I : I I : : III : : I : 

-IIKYVLTAIYLLIIQYLTR LGGKVSVKLI 236 



— IFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMV 64 9 

I : I : :: : : I I Ml II : | | |: :: 

*EINESVEYRANSLIYKTYIGIF YHVLLH-RN FMTLRQVLIQRLI 281 

{QVINNMQEVLIPKLKGWWQKFRLRSKKR-KAGASAGASQ — GPWEDDY 696 

M : : Mil : : I : I I : I I : : I : I I I I : I 



UD 


9fl 9 




697 


JJD 




Ov 


756 


nu 
1JD 


inn 


wy 


816 


Db 


4 z y 


yy 


O / D 


Db 


469 


Qy 


930 


Db 


509 



-ELVPCEGLFDEYLEMVLQFGFVTI FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPV 755 
II :| II I: I I: I II I : :| I III I ::l MM :| I : IN: 



I III:! I ::::!: I I 
\AATIGAWLNIWQFLVVMSICTNSALL 428 



II I : I : : || ::: |||: : I III: 

— VCLY DQEGKWK IEPGLAAILIMEHVLLLLKFGLSRLVPEE 468 

PESVEI-KVK REYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKAS 929 

I I Ml :: I II I :| : I I: 

PAWVRASRVKNVTQAQDMY-CKQLL RSISGEFNSLTKPEQE 508 



I I 



RESULT 9 

US-10-953-349-5637 

Sequence 5637, Application US/10953349 
Publication No. US20060107345A1
GENERAL INFORMATION: 
APPLICANT: ALEXANDROV, Nickolai et al. 

TITLE OF INVENTION: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING POLYPEPTIDES 
TITLE OF INVENTION: ENCONDED THERBY 
FILE REFERENCE: 2750-1579PUS2 
CURRENT APPLICATION NUMBER: US/10/953,349
CURRENT FILING DATE: 2004-09-30
NUMBER OF SEQ ID NOS: 40252
SOFTWARE: PatentIn version 3.3
SEQ ID NO 5637
LENGTH: 543
TYPE: PRT 

ORGANISM: Arabidopsis thaliana 
US-10-953-349-5637 

Query Match 5.7%; Score 283.5; DB 6; Length 543;

Best Local Similarity 23.4%; Pred. No. 4.6e-17; 

Matches 127; Conservative 78; Mismatches 167; Indels 171; Gaps 22; 

'F FS L FMALWAVLL LE YWKRKS AT L AYRWDCSDYEDTEERPRPQFAASAPMTAP 477 

I I I I I :: I I I I : I I I I : :: I 
LWAALFLQFWKRKNAALLASQGYRFLGMEWSSL PFPKELIKNLG 181 



Qy 


422 


Db 


135 


Qy 


478 


Db 


182 


Qy 


533 


Db 


235 


Qy 


593 


Db 


264 


Qy 


650 


Db 


309 


Qy 


'697 


Db 


369 


Qy 


756 


Db 


427 


Qy 


816 



;EDEPY-----FPERSRARRMIJVGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRS 532 
III I I I I ::|::::: : I : I I |: I 

CEKEAYQRYEWFAYRKRFRN DVLVIMSI ICLQLPFELAYAHIFEIITSD- 234 



: : I : I I : : III : :| : 

- 1 1 KYVLTAIYLL IIQYLTR LGGKVS VKL I 263 



III II : | | |: :: 

•YHVLLH-RN- FMT L RQVL I QRL I 308 



: PKLKGWWQKFRL RSKKR-KAGA SAGAS Q — GPWEDDY 696 

IN : : I : I I : I I : : I : I I I I : I 



II I : I I : I I I I : : I till I : : I : I I I : I I : III: 



I III:! I : :: : | : | | 
AAT IGAWLN IWQFLWMSI CTNSALL 455 



II I : I : : I I ::: I I 



Db 456 VCLY DQEGKWK IEPGLAAI LIMEHVLLLLKFGLSRLVPEE 495 

Qy 876 PESVEI-KVK REYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKAS 929

I I : I I : : I I I I : I : I I : 

Db 496 PAWVRASRVKNVTQAQDMY-CKQLL RSISGEFNSLTKPEQE 535

Qy 930 QLQ 932 

I I 

Db 536 QQQ 538 



RESULT 10 

US-10-953-349-5636

; Sequence 5636, Application US/10953349
; Publication No. US20060107345A1
; GENERAL INFORMATION: 

; APPLICANT: ALEXANDROV, Nickolai et al. 

; TITLE OF INVENTION: SEQUENCE-DETERMINED DNA FRAGMENTS AND CORRESPONDING POLYPEPTIDES 

; TITLE OF INVENTION: ENCONDED THERBY 

; FILE REFERENCE: 2750-1579PUS2 

; CURRENT APPLICATION NUMBER: US/10/953,349

; CURRENT FILING DATE: 2004-09-30 

; NUMBER OF SEQ ID NOS: 40252 

SOFTWARE: PatentIn version 3.3
; SEQ ID NO 5636 

LENGTH: 558 

TYPE: PRT 

ORGANISM: Arabidopsis thaliana 
US-10-953-349-5636



Query Match 5.7%; Score 283.5; DB 6; Length 558; 

Best Local Similarity 23.4%; Pred. No. 4.7e-17; 

Matches 127; Conservative 78; Mismatches 167; Indels 171; Gaps 22; 



Qy  422 GT VFFS LFMALWAVLLLE YWKRKSAT L AYRWDCSDYEDTEERPRPQFAASAPMTAP 477
        i 1 : 1 1 1 1 1 :: 1 1 I 1 : 1 1 1 1 : : : I"
Db  150 GTI LWAALFLQFWKRKNAALLASQGYRFLGMEWSSL PFPKELIKNLG 196

Qy  478 NPITGEDEPY-----FPERSRARRMLAGSVVIVVMVAVVVMCLVSIILYRAIMAIVVSRS 532
        1 III 1 1 1 1 ::|::::: : I : I I |: I
Db  197 NERAKEKEAYQRYEWFAYRKRFRN DVLVIMSIICLQLPFE LA YAH IFEIITSD- 249

Qy  533 GNTLLAAWASRIASLTGSVVNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVF 592
        : :|: ||: : III ::|:
Db  250 U KYVLTAI YLL 1 1 QYLT R LGGKVS VKLI 278

Qy  593 IFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMV 649
        1 : 1 : : : : : 1 1 III II : 1 1 1 : : :
Db  279 NREINESVEYRANSLIYKTYIGIF YHVLLH-RN FMTLRQVLIQRLI 323

Qy  650 GKQVINNMQEVLIPKLKGWWQKFRLRSKKR-KAGASAGASQ------------GPWEDDY 696
        II : : : 1 1 1 : : 1 : 1 1 : 1 1 : : 1 : 1 1 1 1 : 1
Db  324 ISQVFWTLMDGSLPYLKYSYRKYRARTKKKMEDGSSTGKIQIASRVEKEYFKPTYSASIG 383

Qy  697 -ELVPCEGLFDEYLEMVLQFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPV 755
        II : 1 1 1 1 : II : 1 M 1 : : 1 1 1 1 1 I : : I : I I I : 1 1 : III:
Db  384 VELE--DGLFDDSLELALQFGMIMMFACAFPLAFALAAVSNVMEIRTNALKLLVTLRRPL 441

Qy  756 AERAQDIGIWFHILAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARA 815
        1 1 1 1 : 1 1 : : : : 1 : 1 1
Db  442 PRAAAT IGAWLN IWQFLWMS ICTNSALL 470

Qy  816 PSSFAAAHNRTCRYRAFRDDDGHYSQTYWNLLAIRLAFVIVFEHVVFSVGRLLDLLVPDI 875
        II 1 : 1 : : 1 1 :: : 1 1 1 : : 'l III:
Db  471 VCLY DQEGKWK IEPGLAAI LIMEHVLLLLKFGLSRLVPEE 510

Qy  876 PESVEI-KVK REYYLAKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKAS 929
        11:11 : : | i | i : I : I I :
Db  511 PAWVRASRVKNVTQAQDMY-CKQLL RSISGEFNSLTKPEQE 550

Qy  930 QLQ 932
        1 1
Db  551 QQQ 553





RESULT 11 
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US-10-196-749-266 

Sequence 266, Application US/10196749 
Publication No. US20060094864A1 
GENERAL INFORMATION : 
APPLICANT: Baker, Kevin P. 
APPLICANT : Chen, Jian 
APPLICANT: Desnoyers , Luc 
APPLICANT: Goddard, Audrey 
APPLICANT: Godowski, Paul J. 
APPLICANT: Gurney, Austin L. 
APPLICANT : Pan, James 
APPLICANT : Smith, Victoria 
APPLICANT: Watanabe, Colin K. 
APPLICANT: Wood, William I. 
APPLICANT: Zhang, Zemin 

TITLE OF INVENTION: SECRETED AND TRANSMEMBRANE POLYPEPTIDES AND NUCLEIC 
TITLE OF INVENTION: ACIDS ENCODING THE SAME 
FILE REFERENCE: P3430R1C340 
CURRENT APPLICATION NUMBER: US/10/196,749
CURRENT FILING DATE: 2002-07-16 
PRIOR APPLICATION NUMBER: 10/052586 
PRIOR FILING DATE: 2002-01-15 
PRIOR APPLICATION NUMBER: 60/059263 
PRIOR FILING DATE: 1997-09-18 
PRIOR APPLICATION NUMBER: 60/059266 
PRIOR FILING DATE: 1997-09-18
PRIOR APPLICATION NUMBER: 60/062250 
PRIOR FILING DATE: 1997-10-17 
PRIOR APPLICATION NUMBER: 60/063120 
PRIOR FILING DATE: 1997-10-24 
PRIOR APPLICATION NUMBER: 60/063121 
PRIOR FILING DATE: 1997-10-24 
PRIOR APPLICATION NUMBER: 60/063486
PRIOR FILING DATE: 1997-10-21 
PRIOR APPLICATION NUMBER: 60/063540 
PRIOR FILING DATE: 1997-10-28 
PRIOR APPLICATION NUMBER: 60/063541 
PRIOR FILING DATE: 1997-10-28 
PRIOR APPLICATION NUMBER: 60/063544 
PRIOR FILING DATE: 1997-10-28 

Prior Application data removed - See File Wrapper or PALM. 
NUMBER OF SEQ ID NOS: 612 
SEQ ID NO 266 
LENGTH: 1089 
TYPE: PRT 

ORGANISM: Homo Sapien 
US-10-196-749-266 

Query Match 2.1%; Score 105; DB 6; Length 1089; 

Best Local Similarity 18.2%; Pred. No. 1.8; 

Matches 108; Conservative 79; Mismatches 174; Indels 234; Gaps 27;

WARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVF 376 

III : I I : I : I I : I I I : I I :: 



Qy 


320 


Db 


527 


Qy 


377 


Db. 


587 


Qy 


396 


Db 


647 


Qy 


456 


Db 


695 


Qy 


509 


Db 


748 


Qy 


538 


Db 


808 



-SDIPTQELCGSKDS FEMC P 395 

: I I : : III 



III I : : : I I : : I I I : I I I : I 
-ISSP-WLSP L ASMVGGRAKN L WY GACV — AAL VAL LAAVRL — WL RR YGN L 694 

lDTEERPRPQFAASAPMTAPNP ITGEDEPYFPERSRARRMLAG-SWIWMV 508 

: I I 1:1 : I I I III : : : I I : I : I 

-KSPEPPMLFVRWGLPLMALGTAAYWALASGADEA — PPRLRV — LVSGASMVLPRAV 74 7 

/VMCLVSIILYRAIMAIWSRSG NTLL 537 

: ::::|::: :| : :| |:| 



-AAW ASRIASLT GSWNLVFILILSKIY 564 

II: I : : : I I : : I I I : I : : : 
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1 k 



Qy 565 VSLAHVL TRWEMHRTQTKFEDAFTLKVFI FQFVNFYSS 602 

: I I : I : | : | I I III: 
Db 868 L - L L H L LAAG I P VTT P G P FT VP WQAVS AWALMAT QT FYSTGHQ 909 

Qy 603 PVYIAF-FKGRFVGYPGNY HTLFGVRNEECAAGGCLI ELAQE 64 3 

I I : I : I I I : I : I I I I I I : I 

Db 910 PVFPAI HWHAAFVGFPEGHGSCTWLPALLVGANT FASHLLFAV GCPLLLLWP 961 

Qy 64 4 LLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPC — 701 

I I | | : : : I I I | j : : | : 

Db 962 FLCESQG LRKRQQPPGNEADARVRPEEEEEPLMEMRL 998 

Qy 702 EGLFDEYLEMVLQFGFVT- I FVAAC PLAPLFALLNNWVEI -RLDARKFVCE 750 

: : |:: |:: |: I : II II ::| : : :: I ||: I 
Db 999 RDAPQHFYAALLQLGLKYLFILGIQILACALAA— SILRRHLMVWKVFAPKFIFE 1051 



RESULT 12 

US-10-449-902-49799

; Sequence 49799, Application US/10449902 
; Publication No. US20060123505A1 
; GENERAL INFORMATION : 

APPLICANT: National Institute of Agrobiological Sciences. 
; APPLICANT: Bio-oriented Technology Research Advancement Institution. 
; APPLICANT: The Institute of Physical and Chemical Research. 
; APPLICANT: Foundation for Advancement of International Science. 
; TITLE OF INVENTION: FULL-LENGTH PLANT cDNA AND USES THEREOF 
; FILE REFERENCE: MOA-A0205Y1-US 
; CURRENT APPLICATION NUMBER: US/10/449,902 
; CURRENT FILING DATE: 2003-05-29 
; PRIOR APPLICATION NUMBER: JP 2002-203269 
; PRIOR FILING DATE: 2002-05-30 
; PRIOR APPLICATION NUMBER: JP 2002-383870 
; PRIOR FILING DATE: 2002-12-11 
; NUMBER OF SEQ ID NOS: 56791
; SOFTWARE: PatentIn Ver. 2.1
; SEQ ID NO 49799 

LENGTH: 697 

TYPE: PRT 

ORGANISM: Oryza sativa 
US-10-449-902-49799

Query Match 2.1%; Score 104; DB 6; Length 697; 

Best Local Similarity 18.5%; Pred. No. 1.2; 

Matches 105; Conservative 78; Mismatches 162; Indels 224; Gaps 25; 

Qy 140 WRETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQA 199

 I I : :: : :: I :| I |:| || :| :: : || ::: :

Db 73 WTLTLIGWKYICIALNADDHGEGGTFAMYSL LCQHA-NIGI -LPSKKIYTEE 123

Qy 200 SNWSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQIL 259 

I 1:11 I : I I I : I I 
Db 124 EN LISNQPWAG RPGRLRRFIESS IIA 150 

Qy 260 FE I L AKT P YGH E KKN L LG I H QL LAEGVL S AAF P L H D GPF KTPPEGPQAPR 309 

: I I : I I : I : : I : I : ! : ill I III 

Db 151- RRLLLLTA ILGMCMLIGDGILTPAISVLSAIDGLRGPFPSVSKPAVEGLSAAI 203 

Qy 310 LNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYF AW LGFYTGW — 356 

I I : : : I I I : I I i : I I : I 

Db 204 L VGLFLLQKYGTS KVS FMFSP IMAAWT FAT PVI GVYS I WRY 244 

Qy 357 LLPAAW GTLVFLVGCFLVFSDI 379 

: I : I I I : : : I :| :| : 

Db 24 5 YPGIFKAMSPHYIVRFFMTNQTRGWQLLGGTVLCITGAEAMFADLGHFSKRSIQIAFMSS 304 

Qy 380 PTQELCGSKDSF EMCPLCLDCPFWLLSSACALAQAGRLFDHGGT 4 23 

I I : I I : I : I : : : : : I : : 
Db 305 IYPSLVLTYAGQTAYLINNVDDFSDGFYKFVPRPVYWPMFIIATLAAIVASQ 356 

Qy 424 VFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGE 483 

II I ::|: ::| I I II : :: I 
Db 357 — SLISATFSVI— ' KQSWLDY FPRVKWHTSK DKE 388 

Qy 484 DEPYFPERSRARRMLAGSWI WMVAVWMCLVSI ILYRAIMAIWSRSG 533 

Mil: : t I : : I |::: I:: II :| |: 



http://es/ScoreAccessWeb/GetItem.action?AppId=10552515&seqId=775628&Item 11/17/2006 






Db 389 GEVYSPETNYMLMLLCVGVI LGFGDGKDIGNAFGVWI LVML ITT I LLTLVMLI I 443 

Qy 534 NTLLAAWASRIASLTGSWNLVFILI LSKI YVS LAHVLTRW 574 

I : : : : I I I : I III . I I I : I 

Db 44 4 WGTH WLV ALYLVPFLLLEATYVS AVCT KI LRGGWVPFAVSVALAAVMFGW 494 

Qy 57 5 EMHRTQTKFEDAFTLKVFIFQFVNFYSSP 603 

I I I I II:: II 

Db 4 95 YYGR-QRKTEYEAANKVTLERLGELLSGP 522 



RESULT 13 

US-11-056-355B-8075 

; Sequence 8075, Application US/11056355B 
; Publication No. US20060150283A1 
; GENERAL INFORMATION: 

APPLICANT: Brover, Vyacheslav 
APPLICANT: Alexandrov, Nickolai 
; TITLE OF INVENTION: Sequence Determined DNA Fragments and Corresponding 
; TITLE OF INVENTION: Polypeptides Encoded Thereby 
; FILE REFERENCE: 2750-1590PUS2 
; CURRENT APPLICATION NUMBER: US/11/056,355B
; CURRENT FILING DATE: 2005-02-14 
; PRIOR APPLICATION NUMBER: 60/544,190 
; PRIOR FILING DATE: 2004-02-13 
; NUMBER OF SEQ ID NOS: 119966
; SEQ ID NO 8075 

LENGTH: 442 

TYPE: prt 
; ORGANISM: Zea mays subsp. mays 

FEATURE: 

NAME/KEY: peptide
LOCATION: (1) . . (442) 

OTHER INFORMATION: Ceres Seq. ID no. 12384177 
US-11-056-355B-8075 



Query Match 2.1%; Score 102; DB 7; Length 442;

Best Local Similarity 18.6%; Pred. No. 0.94; 

Matches 70; Conservative 50; Mismatches 109; Indels 148; Gaps 14; 



Qy 441 WKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAG 500
 1 : 1 1 1 I III 1 : 1 1 1 1 1 1 1 1
Db 113 WRRLGATNA PRPPRRRPPPLTAPG RWRS PRRRRRVG 148

Qy 501 SV VIWMVAVWMCLVSIILYRAIMAIVVSRSGNT-LLAAWASRIA 545
 : : 1 : : ::: 1 : : : : 1 : : 1 1 :|| : 1 1 1
Db 149 APRSGSTSTWWALNVIFNIYNKKVLNAFPYPWLTSTLSLAAGSAIMLASWATRIAEAPQT 208

Qy 546 SLTGSWNLVFILILSKI YVSLAHVLTRWEMHRTQTKFEDAFTLKVFIF 594
 :: :: :: : ::|: II |:: 1 1 I:: 1 1-
Db 209 DLDFWKALTPVAIAHT I GHVAAT VSMAKVAVS FT H II KSGEPAFSVLVSRF 259

Qy 595 QFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVG -- KQ 652
 : : 1 1 1 : III: 1 1 1 1 1
Db 260 FLGEHFPAPVYFSLLP II GGCALAAVTELN FNMVGFMGA 298

Qy 653 VI NNMQEVL I PKLKGWWQKFRLRS KKRKAGAS AGAS QGP 691
 : 1 : 1 : 1 : : I I 1 | | 1 1 : II
Db 299 MISNLAFW RT I FSKKGMKGKSVSGMN YYACLS IMSLVI LLPFAVAMEGP 348

Qy 692 -- WEDDYEL VPCEGLF DEYLEMVLQFG 716
 1 :: | : :| 1 1 : 1
Db 349 KVWAAGWQTAVAEI GPNFVWWVAAQSVFYHLYNQVS YMSLDE I S PLTFS I GNTMKRI SVI 408

Qy 717 FVTIFVAACPLAPLFAL 733
 : 1 : I : 1 : 1 1
Db 409 VASIIIFQTPVQPINAL 425




RESULT 14 
US-10-449-902-43353





Sequence 43353, Application US/10449902 
Publication No. US20060123505A1 
GENERAL INFORMATION: 

APPLICANT: National Institute of Agrobiological Sciences. 

APPLICANT: Bio-oriented Technology Research Advancement Institution. 



; APPLICANT: The Institute of Physical and Chemical Research. 

; APPLICANT: Foundation for Advancement of International Science. 

; TITLE OF INVENTION: FULL-LENGTH PLANT cDNA AND USES THEREOF 

; FILE REFERENCE: MOA-A0205Y1-US

; CURRENT APPLICATION NUMBER: US/10/449,902

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: JP 2002-203269 

; PRIOR FILING DATE: 2002-05-30 

; PRIOR APPLICATION NUMBER: JP 2002-383870 

; PRIOR FILING DATE: 2002-12-11 

; NUMBER OF SEQ ID NOS: 56791 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 43353

; LENGTH: 529
; TYPE: PRT
; ORGANISM: Oryza sativa
US-10-449-902-43353 

Query Match 2.0%; Score 101; DB 6; Length 529; 

Best Local Similarity 20.6%; Pred. No. 1.5; 

Matches 71; Conservative 53; Mismatches 137; Indels 84; Gaps 16; 



Qy 463 RPRP QFAASAPMTAP NPITGEDE -- -PYFPERSRARRMLAGS 501
 MM : 1 : M : 1 1 : 1 : 1 1 1 1 : : I :
Db 50 RPRPPWCRFSASSPPPPPDDPDDYELLDTTGNCDPLCSVDEVSSQYFEANYKPKNDLLKA 109

Qy 502 WIWMVAV WMCLVSIILYRAIM AIWSRSGNTLLAA W 540
 : 1 : : 1 M : 1 I : : : : : M M 1 1
Db 110 LTIIATALAGAAAINHSWVAEHQDIAMVLVFALGYAGIIFEESLAFNKSGVGLLMAVCLW 169

Qy 541 ASR IASLTGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAF 587
 1 :: | | :M 1: : 1 : :: : :
Db 170 VIRSIGAPSTDVAVQELSHTTAEVSEIVFFLLGAMTIVEIVDAHQGFKLVTDNISTRNPR 229

Qy 588 TLKVFI FQFVNFY SSPVYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGC- 636
 M : : : M 1 : 1 : 1 : : 1 : 1 1 1 1 1 Ml
Db 230 TL-LWVIGFVTFFLSSILDNLTSTIVMVSLL-- RKLVPPSEYRKLLGAVWISANAGGAW 286

Qy 637 -- LIELAQELLVIMVGKQVINNMQEVLIPKLKGWWQKFRLRSKKRKAGASAGASQGPWED 694
 : : : Ml M M : M : II M 1 : 1
Db 287 TPIGDVTTTMLWIHGQITTLNTMQGLFLPSWSLAVPLALMSLTSEANGSSQKSSSLLSS 346

Qy 695 DYELVPCEGLFDEYLEMVLQFG FVT I FVAAC P LA PL FALL 734
 : : : 1. 1 : M 1 MM 1 II : : .
Db 347 E-QMAP-RG QLVFAVGLGALVFVPVFKALTGLPPFMGMM 383




RESULT 15 
US-10-449-902-52268





Sequence 52268, Application US/10449902 
Publication No. US20060123505A1 



; GENERAL INFORMATION: 

; APPLICANT: National Institute of Agrobiological Sciences. 

; APPLICANT: Bio-oriented Technology Research Advancement Institution. 

; APPLICANT: The Institute of Physical and Chemical Research. 

; APPLICANT: Foundation for Advancement of International Science.

; TITLE OF INVENTION: FULL-LENGTH PLANT cDNA AND USES THEREOF 

; FILE REFERENCE: MOA-A0205Y1-US 

; CURRENT APPLICATION NUMBER: US/10/449,902

; CURRENT FILING DATE: 2003-05-29 

; PRIOR APPLICATION NUMBER: JP 2002-203269 

; PRIOR FILING DATE: 2002-05-30 

; PRIOR APPLICATION NUMBER: JP 2002-383870 

; PRIOR FILING DATE: 2002-12-11 

; NUMBER OF SEQ ID NOS: 56791 

; SOFTWARE: PatentIn Ver. 2.1

; SEQ ID NO 52268 

LENGTH: 532 

TYPE: PRT 

ORGANISM: Oryza sativa 
US-10-449-902-52268 

Query Match 2.0%; Score 101; DB 6; Length 532; 

Best Local Similarity 20.6%; Pred. No. 1.5; 

Matches 71; Conservative 53; Mismatches 137; Indels 84; Gaps 16; 
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Qy 


463 


Db 


50 


Qy 


502 


Db 


110 


Qy 


541 


Db 


170 


Qy


588 


Db 


230 


Qy


637 


Db 


287 


Qy 


695 


Db 


347



IPRP QFAASAPMTAP NPITGEDE PYFPERSRARRMLAGS 501 

Mi :|:M:| I . :|: II II : : I : 



-WMCLVSI ILYRAIM AI WSRSGNTLLAA W 54 0 

: I II : I I: :: ::|| III I 



I I : I I 



2FVNFY SSPVYIAFFKGRFVGYPGNYHTLFG-VRNEECAAGGC- 636 

III: I : I :: I : I I I I I III 



-LI ELAQELLVIMVGKQVINNMQEVLI PKLKGWWQKFRLRSKKRKAGASAGASQGPWED 694 
: :: : I I : I I I : : I : . II : I I : I 



fPCEGL FDEYLEMVLQFG FVT I FVAACPLAPL FALL 734 

II : : I I 11:11 II : : 

VP-RG QLVFAVGLGALVFVPVFKALTGLPPFMGMM 383 



Search completed: October 27, 2006, 20:34:47 
Job time : 53 secs

This page gives you Search Results detail for the Application 10552515 and Search Result us-10-55 



GenCore version 5.1.9 
Copyright (c) 1993 - 2006 Biocceleration Ltd.



OM protein - protein search, using sw model 

Run on: October 27, 2006, 20:23:48 ; Search time 48 Seconds 

(without alignments) 
1870.213 Million cell updates/sec 

Title: US-10-552-515-1 
Perfect score: 4950 

Sequence: 1 MRMAATAWAGLQGPPLPTLC......SELSSHWTPFTVPKASQLQQ 933



Scoring table: BLOSUM62
Gapop 10.0 , Gapext 0.5

Searched: 283416 seqs, 96216763 residues

Total number of hits satisfying chosen parameters: 283416

Minimum DB seq length: 0
Maximum DB seq length: 2000000000

Post-processing: Minimum Match 0%
Maximum Match 100%
Listing first 45 summaries

Database: PIR 80:*
1: pir1
2: pir2
3: pir3
4: pir4


Pred. No. is the number of results predicted by chance to have a 
score greater than or equal to the score of the result being printed, 
and is derived by analysis of the total score distribution. 
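
The throughput figure in the header above can be cross-checked from the other header values. The sketch below assumes that one "cell update" is one (query residue, database residue) pair in the Smith-Waterman matrix; the report does not define the term, so this definition and the function name are assumptions, not GenCore documentation.

```python
# Values copied from this report's header.
QUERY_LENGTH = 933          # residues in query US-10-552-515-1
DB_RESIDUES = 96_216_763    # "Searched: 283416 seqs, 96216763 residues"
SEARCH_SECONDS = 48         # "Search time 48 Seconds"


def million_cell_updates_per_sec(query_len, db_residues, seconds):
    """Assumed definition: one cell update per (query residue, db residue) pair."""
    return query_len * db_residues / seconds / 1e6


rate = million_cell_updates_per_sec(QUERY_LENGTH, DB_RESIDUES, SEARCH_SECONDS)
print(f"{rate:.3f} Million cell updates/sec")  # prints 1870.213 Million cell updates/sec
```

Under that assumption the computed value agrees with the 1870.213 Million cell updates/sec printed in the header.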



SUMMARIES 



                 %
Result         Query
  No.   Score  Match  Length  DB  ID       Description

    1     734   14.8    1049   2  T22762   hypothetical prote
    2   288.5    5.8     572   2  F96755   hypothetical prote
    3   181.5    3.7     946   2  S48255   probable membrane
    4     117    2.4     548   2  148693   natural resistance
    5   115.5    2.3    3010   1  GNWVTC   genome polyprotein
    6   110.5    2.2     680   2  T35404   probable squalene-
    7   110.5    2.2     873   2  S46584   probable membrane
    8     110    2.2     792   2  T00487   probable potassium
    9     108    2.2    3010   1  A45573   genome polyprotein
   10   106.5    2.2     519   2  T11129   cytochrome-c oxida
   11     106    2.1     438   2  B86088   probable citrate p
   12     106    2.1     438   2  E91240   probable membrane
   13     105    2.1     621   2  JC1346   dopamine beta-mono
   14     104    2.1     646   2  H82555   c-type cytochrome
   15   103.5    2.1     478   2  JQ2034   RNA-directed RNA p
   16     102    2.1     302   2  C83993   hypothetical prote
   17   101.5    2.1     395   2  D81040   cytochrome c-type
   18   101.5    2.1     395   2  B81986   probable membrane
   19     101    2.0     466   2  A95355   probable inner-mem
   20   100.5    2.0     585   2  S74673   pleD protein - Syn
   21     100    2.0     515   2  D71390   cytochrome-c oxida
   22     100    2.0    3010   1  GNWVCJ   genome polyprotein
   23    99.5    2.0     737   2  AG2156   hypothetical prote
   24    98.5    2.0     413   2  AF0393   NADH2 dehydrogenas
   25    98.5    2.0    1353   2  T26301   hypothetical prote
   26    98.5    2.0    1755   2  S69845   TyB protein - yeas
   27      98    2.0     348   2  T12280   NADH2 dehydrogenas
   28      98    2.0    1265   2  T51314   probable CO-induce
   29    97.5    2.0     348   2  T12291   NADH2 dehydrogenas
   30    97.5    2.0     348   2  T12290   NADH2 dehydrogenas
   31    97.5    2.0     460   2  A84154   amino acid transpo
   32    97.5    2.0     906   2  G83156   probable transcrip
   33    96.5    1.9     348   2  T12281   NADH2 dehydrogenas
   34    96.5    1.9     417   2  C81084   probable integral
   35    96.5    1.9     491   2  B70414   NADH2 dehydrogenas
   36    96.5    1.9    1755   2  S69969   TyB protein - yeas
   37      96    1.9     419   1  SYPJCD   naringenin-chalcon
   38      96    1.9     428   2  T48284   hypothetical prote
   39      96    1.9     865   2  T40288   hypothetical prote
   40    95.5    1.9     572   2  T48601   hypothetical prote
   41    95.5    1.9     758   2  D71072   hypothetical prote
   42    95.5    1.9    1755   2  S50663   TyB protein - yeas
   43      95    1.9     429   2  AG3150   hypothetical prote
   44      95    1.9     473   2  AC0479   glycerol-3-phospha
   45      95    1.9     631   2  B98137   hypothetical 46.1K
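
The "% Query Match" column in the summaries appears to be the hit score expressed as a percentage of the perfect score (4950) given in the header. The report does not state this definition, so the sketch below treats it as an assumption; the function name is hypothetical, and the scores are the first four summary rows.

```python
PERFECT_SCORE = 4950  # "Perfect score: 4950" in the report header


def query_match_percent(score, perfect_score=PERFECT_SCORE):
    """Assumed definition: hit score as a percentage of the query's self-score."""
    return 100.0 * score / perfect_score


# (result number, score) pairs taken from the first summary rows.
for result_no, score in [(1, 734), (2, 288.5), (3, 181.5), (4, 117)]:
    print(result_no, f"{query_match_percent(score):.1f}%")
```

For these four rows the computed values round to 14.8, 5.8, 3.7 and 2.4, the same percentages the summary lists.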



ALIGNMENTS 



RESULT 1 
T22762 

hypothetical protein F56A8.1 - Caenorhabditis elegans 
C; Species: Caenorhabditis elegans 

C;Date: 15-Oct-1999 #sequence_revision 15-Oct-1999 #text_change 09-Jul-2004 
C;Accession: T22762 
R;McMurray, A. 

submitted to the EMBL Data Library, December 1996 
A; Reference number: Z19612 
A;Accession: T22762 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A; Molecule type: DNA 
A; Residues: 1-1049 

A;Cross-references: UNIPROT:O45572; UNIPARC:UPI000007F44C; EMBL:Z83230; PIDN:CAB05741.1; GSPDB:GN00021; CES

A; Experimental source: clone F56A8 

C; Genetics: 

A; Gene: CESP:F56A8.1 

A; Map position: 3 

A;Introns: 86/3; 146/3; 208/3; 245/2; 295/2; 325/2; 397/3; 532/3; 582/2; 612/3; 654/1; 677/1; 707/3; 734/3; 
C; Superfamily: Caenorhabditis elegans hypothetical protein F56A8.1 



Query Match 14.8%; Score 734; DB 2; Length 1049; 

Best Local Similarity 27.0%; Pred. No. 9.2e-54; 

Matches 248; Conservative 137; Mismatches 309; Indels 224; Gaps 32; 
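
The "Best Local Similarity" figure is consistent with the number of identical positions divided by the total number of alignment columns (matches, conservative substitutions, mismatches and indels). That reading is inferred from the numbers in this report rather than from GenCore documentation, so the sketch below is an assumption; the function name is hypothetical.

```python
def best_local_similarity(matches, conservative, mismatches, indels):
    """Assumed definition: identical columns as a percentage of all alignment columns."""
    columns = matches + conservative + mismatches + indels
    return 100.0 * matches / columns


# Counts for RESULT 1 (T22762) as printed above.
print(f"{best_local_similarity(248, 137, 309, 224):.1f}%")  # prints 27.0%
```

The same formula gives 20.5% for RESULT 2 (169, 97, 236, 323) and 18.6% for RESULT 3 (118, 99, 243, 173), which agree with the values printed for those results.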



Qy 96 AAACRAGSPAKP-RIA-DFVLVWEEDLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGL 153
 II 1111:11111 : 1 1 1 : : 1 1 1 :: I I
Db 3 AATT EVDYPYFPFRISIDFVLV HNAAESRS - - KGKYRE FFEKAVQKEGL 49

Qy 154 CVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLGIP 213
 : 1 1 1 1 1 : 1 : 1 : 1 1 1 : : 1 |: : : I : I
Db 50 IIRHQ -- QSGQT -- HFTLISTPFHRLTREAEMSQMCFPLKDCQVKP GLP 94

Qy 214 NVLLEV VPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYG 269
 : : : II : : 1 : II : : : 1 1 II : : 1 : :: 1 1 1
Db 95 SCCIPLSQIFVTDDTVRFINAPFQRKHGSLFLNYHDEKSFFTSSQRGYLTYQILTKIDIS 154

Qy 270 HEKK NLLGIHQLLAEGVLSAAFPLH 294
 : 1 1:1111111
Db 155 KDLKGERLGESQDEPTDPSTSSITSDEQLRRKGLSWLLMSDVYEEAFVLHAPSKEEPYFK 214

Qy 295 DGPFKTPPEGPQAPRLNQRQVLFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLG 351
 :| 1 1 1: 1: 1 :| 1 1 1 : 1 1 1 : : 1 I 1 I I : : I I I I I I
Db 215 AMQNGSVKAYNEFISEIELDPRRSLSLNWER WYKFQPLNKIRDYFGEQIAYYFAWQG 271

Qy 352 FYTGWL L P AA WGT LV FL VGC FL V FS D I P TQEL-CGSKDS FE 392

 : I II : I :M: I II |: : II :: |

Db 272 T FLT LL WP AV I FGL W F I YG F I D S I S S A PL DWN H CKWN F I GQT EN VACGMRNGVT L F FS 331 

Qy 393 MCPLCLDCP FWL L S S AC ALAQAGRL FDH GGT V F F S L FMAL WA VL L L E YWKRKS AT LAY RW 452 

I I :M II ll::||::| : :: III :: |:|:| 

Db 332 MVTQ WFMSS FDTKMNAFFAVFMS IWGSVFVQIWKRNNSVLSYQW 375 

Qy 4 53 DCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVW 512 

: I: I I I • I I r t I I I I I I : I : I I I I : : III 

Db 376 NSDDFHAI EP-DRPEFRGS — KVKEDPITGEDIWI SPALARYI KMLASFVFVSFSMLVW 432 

Qy 513 MCLVSI ILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLT 572 

: I : : I : I : III: I :: I : : I I I : I I 

Db 433 ISLMLVTLLKIWMVYN FQCTKEYTFHCWLS — AAFLPSILNTLSAMGLGAIYSNLVSRLN 490 

Qy 573 RWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNE 629 

Mill::: : : : I : I I I I I I : I | : | | : | | 111= | 
Db 491 SWENHRTESEHNNSLIVKIFAFQMVNTYTSLFYVAFIRPESHGLQPN — GLFGLGTEFKD 548 

Qy 630 EC AAGGCL I E LAQE LL VI MVGKQV I N NMQE VL IP KLKGWWQK FRLRS KKRKAGA SAG 686 

I I -11:11 : I ::|::l II Mil : I 

Db 54 9 TCLDDTCSSLLALQLLTHTLIKPFPKFFKDWLPYFVKL FRLRMYTSRTEARVE 602 

Qy 687 ASQGPWEDDYELVPCEGLFDEYLEMVLQFGFVTI FVAACPLAPLFALLNNWVEIRLDARK 746 

III: I : I : I I I I I I I : : I :: I : I I : 

Db 603 I EDDDQ AN VLMFAS L F P L A P LL AL 1 1 G FVDMR I D AH R 639 

Qy 74 7 FVCEYRRPVAERAQDI GI WFH I LAGLTHLAVI SNAFLLAFSSDFLPRAYYRWTRAHDLRG 806 

: I : I : I I I I II I : I I : I I I : : I I : I I 
Db 640 LIWFNRKPIPIITNGIGIWLPILT FLQYCAVFTNAFIVAFTSGFC . 684 

Qy 807 FLNFTLARAPSSFAAAHNRTCRYRAFRDDDGHYSQTYWNLLAIRLAFVIVFEHWFSVGR 866 

I : I I ill I I II I I I I : : : II : 
Db 685 ST FLA DGAYC-TVQN RL 1 1 V I VFQN L V FGL K Y 715 

Qy 867 LLDLLVPD I PESVE IKVKREYY LAKQA- LAEN EVLFGT 903 

II ::| II !::: :::: I : I I I : I I : : : 

Db 716 LLSSVIPSIPASIKLALRKKRYWAHIVEKGDVPHRTRIKKRTRIAKLAWIASNQKMIK3 775 

Qy 904 NGTKDEQPKGSELS SHWT 921 

I I :: I I : I 

Db 776 NRKKEKSNK KHFT 788 



RESULT 2 
F96755

hypothetical protein F3N23.22 [imported] - Arabidopsis thaliana 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 02-Mar-2001 #sequence_revision 02-Mar-2001 #text_change 09-Jul-2004 
C;Accession: F96755 

R;Theologis, A.; Ecker, J.R.; Palm, C.J.; Federspiel, N.A.; Kaul, S.; White, O.; Alonso, J.; Altaf, H.; Ara
Nature 408, 816-820, 2000 

A; Authors: Hunter, J.L.; Jenkins, J.; Johnson-Hopson, C. ; Khan, S.; Khaykin, E.; Kim, C.J.; Koo, H.L.; Krem 
A; Authors: Salzberg, S.L.; Schwartz, J.R.; Shinn, P.; Southwick, A.M.; Sun, H.; Tallon, L.J.; Tambunga, G.;
A; Title: Sequence and analysis of chromosome 1 of the plant Arabidopsis. 
A;Reference number: A86141; MUID: 21016719; PMID: 11130712 
A;Accession: F96755 
A; Status: preliminary 
A; Molecule type: DNA 
A; Residues: 1-572 

A;Cross-references: UNIPROT:Q9SSM5; UNIPARC:UPI00000A63BD; GB:AE005173; NID:g5903091; PIDN:AAD55649.1; GSPD

C; Genetics:

A; Gene: F3N23.22

A; Map position: 1

Query Match 5.8%; Score 288.5; DB 2; Length 572;

Best Local Similarity 20.5%; Pred. No. 3.4e-16;

Matches 169; Conservative 97; Mismatches 236; Indels 323; Gaps 29;

Qy 14 2 ETFLDNLRAAGLCVDQQDVQDGNTTVHYALLSASWAVLCYYAEDLRLKLPLQELPNQASN 201 

I : II I: II: I: I |:: I : I I I : 

Db . 32 EVLVTELRKKGMWDR -WGLAHEFLKVAAPSEILGNAAAE 71 

Qy 202 WSAGLLAWLGIPNVLLEWPDVPPEYYSCRFRVNKLPRFLGS DNQDTFFTSTKRHQILFE 261 

III 1:11 : : II : I : : I I 

Db 72 LHIRKPTRLGI DLPFEMQGSEAFIRQPDGLLFS WFERFRCYQHLIY 117 
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Qy 262 ILAKTPYGHEKKNLLG IHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPR 309 

: : I I : I : : I : I I I : I I I I I 

Db 118 GIVNSG-GHDVTLKLDGREFCWTAGESLLRRLESEGVIKQMFPLHDE 163 

Qy 310 LNQRQVLFQHWA-RWGKWN-KYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLV 367 

: I : I I : I I I II I I : I : I I I I 

Db 164 -LKRKELLQNWALNW — WNCTNQPI DQI YSYFGAK 195 

Qy 368 FLVGCFLVFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFS 4 27 

Db 196 ELIKNLGN 203 

Qy 428 LFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPY 4 87 

I I I I I I 

Db 204 ERAKEKEAYQRYEW 217 

Qy 488 FPERSRARRMLAGSWIVVMVAVVVMCLVSIILYRAIMAIVVSRSGNTLLAAWASRIASL 547 

I I I I :: |: :::: : | : I I I : I : I I 

Db. 218 FAYRKRFRN DVLVIMSI ICLQLPFELAYAHI FEI ITSDI IKYVLTA 2 63 

Qy 548 TGSWNLVFILILSKIYVSLAHVLTRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIA 607 

: I: I |: :.: : : I I :: : :: III I II 

Db 264 IYLLIIQYLTRLGGKVSVKLINREINESVEYRANSLIYKVF — - — GLYFMQTYIG 314 

Qy 608 FFKGRFVGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINNMQEVLIPKLKG 667 

I III II • : | | | : :: | | : : : | | | 

Db 315 IF YHVLLH-RN FMTLRQVLIQRLI I SQVFWTLMDGSLPYLKY 355 

Qy 668 WWQKFRLRSKKR-KAGASAGASQ — GPWEDDY ELVPCEGL FDEYLEMVL 713 

: : I : I I : I I : : I : I I I I : I II : I I I I : I I : I 

Db 356 SYRKYRARTKKKMEDGSSTGKIQIASRVEKEYFKPTYSASIGVELE — DGLFDDSLELAL 413 



Qy 714 QFGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLT 773 

I I I : : I I I I I I : : I : M I : I I : III: I I I I : I ! 

Db 414 Q FGM I MMF AC AF P L A F AL AAVS N VME I RT N AL KL L VT L RR P L P RAAAT I GAW LN I WQ F LV 473 

Qy 774 HLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNETLARAPSSFAAAHNRTCRYRAFR 833 

::: :|: I I II 

Db 474 VMSICTNSALL VCLY 4 88 



Qy 834 DDDGHYSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDI PESVEI-KVK REY 887 

I : I : : I I : : : I I I : : I I I I : I I : I I : : 

Db 489 DQEGKWK IEPGLAAILIMEHVLLLLKFGLSRLVPEEPAWVRASRVKNVTQAQDM 542 

Qy 888 YLAKQALAENEVLFGTNGTKDEQPKGSELSSHWTPFTVPKASQLQ 932 

till :| : | |: | | 

Db 54 3 Y-CKQLL RSISGEFNSLTKPEQEQQQ 567 



RESULT 3 
S48255 

probable membrane protein YBR086c - yeast ( Saccharomyces cerevisiae) 
N; Alternate names: hypothetical protein YBR0809 
C; Species: Saccharomyces cerevisiae 

C;Date: 03-Aug-1995 #sequence_revision 11-Aug-1995 #text_change 09-Jul-2004
C;Accession: S48255; S45954; S44670 

R;Mannhaupt, G.; Stucka, R.; Ehnle, S.; Vetter, I.; Feldmann, H. 
Yeast 10, 1363-1381, 1994 

A;Title: Analysis of a 70 kb region on the right arm of yeast chromosome II. 
A;Reference number: S48255; MUID:95208357; PMID:7900426
A;Accession: S48255

A;Status: nucleic acid sequence not shown
A;Molecule type: DNA
A;Residues: 1-946

A;Cross-references: UNIPROT:P38250; UNIPARC:UPI0000036C25; EMBL:X78993; NID:g476045; PIDN:CAA55593.1; PID:g

R;Feldmann, H.; Mannhaupt, G.; Schwarzlose, C.; Vetter, I.

submitted to the Protein Sequence Database, August 1994

A;Reference number: S45927

A;Accession: S45954

A;Molecule type: DNA

A;Residues: 1-946

A;Cross-references: UNIPARC:UPI0000036C25; EMBL:Z35955; NID:g536351; PID:g536352; MIPS:YBR086c

C;Genetics : 

A; Gene: SGD:IST2 

A;Cross-references: SGD:S0000290 
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A; Map position: 2R 

C;Superfamily: Saccharomyces cerevisiae probable membrane protein YBR086c 
C; Keywords: transmembrane protein 

F;131-147/Domain: transmembrane #status predicted

F;158-174/Domain: transmembrane #status predicted

F;207-243/Domain: transmembrane #status predicted

F;248-274/Domain: transmembrane #status predicted

F;302-324/Domain: transmembrane #status predicted

F;450-477/Domain: transmembrane #status predicted

F;506-532/Domain: transmembrane #status predicted

F;563-588/Domain: transmembrane #status predicted

Query Match 3.7%; Score 181.5; DB 2; Length 946;

Best Local Similarity 18.6%; Pred. No. 8.6e-07;

Matches 118; Conservative 99; Mismatches 243; Indels 173; Gaps 25;

Qy 34 2 KVAL Y F AWLG FYTGWL L P AA WGT L V FL VGC FL V FSDIPTQELCGSKDSFEMCPLCLDCP 4 01 

I : I I I I : i I I I : I : I : : I 
Db 119 KQSLYFAFLQNYIKWLIPFSFFGLSIRFLSNF 150 

Qy 402 FWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRK SATLAYRWDCSD 4 56 

:: I II : : I I II: I 

Db 151 TYEFNST — YSLFAILWTLSFTAFWLYKYEPFWSDRLSKYSSFST 193 

Qy 4 57 YEDTEERPRPQFAASAPMTAPN PITGEDEPYFPERSRARRMLAGSWIWMVAVW 512 

I ::: : I I |: : |: :| |: :::: : 
Db 194 IEFLQDKQKAQKKASSVIMLKKCCFI PVA LLFGA ILLSFQL 234 

Qy 513 MCLVSI ILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVS-LAHVL 571 

I 11:1 : I : II : : : I : I : I | : 

Db 235 YC FALE I FYKQI Y NGPMI S I LS FL PT I L I CT FT P VLT V I YNKYFVE PM 282 

Qy 57 2 TRWEMHRTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGVRNEEC 631 

I : I I I : : : I I : I : : I I : i I : I : : | : 

Db 283 TKWENHSSWNAKKSKEAKNFVIIFLSSY-VPLLITL FLYLPMGHLLTAEIRTKVF 337 

Qy 632 AAGGCLIEL AQELLVIMVGKQVINNMQEVLI PKLKGWWQK 671 

III : :| 1:1 I :| I |: 

Db 338 NAFSILARLPTHDSDFIIDTKRYEDQFFYFIVINQLIQFSMENFVPSLVSIAQQKINGPN 397 

Qy 672 FRLRSKKRKAGASAGASQGPWE — DDYELVPCEGLFD EYLEMVLQFGFVTIFVA 723 

: I: II I: : I |: I || : :::||||:: :| 

Db 398 PNFVKAESEIGKAQLSS-SDMKIWSKVKSYQTDPWGATFDLDANFKKLLLQFGYLVMFST 4 56 

Qy 724 AC P L A P L F AL LN NWVE I RL D ARKFVC EY-RRPVAERAQD- IGIWFHILA 770 

I I I I |: I : ::| II I || |: :: :|:| :| 

Db 457 IWPLAPFICLIVNLIVYQVDLRKAVLYSKPEYFPFPIYDKPSSVSNTQKLTVGLWNSVLV 516 

Qy 771 GLTHL-AVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCRY 829 

: I II: : I :| : : | | 

Db 517 MFSILGCVITATLTYMYQSCNIP GVGAHTSIHTNKAWY 554 

Qy 830 RAFRDDDGHYSQTYWNLLAIRLAFVIVFEHWFSVGRLLDLLVPDIPESVEIKVKREYYL 889 

I : :: I : : : : I | | :: | :: : | : :: : 
Db 555 LA NPINHSWINI VLYAVFIEHVSVAIFFLFSSILKSSHDDVANGIVPKHW 605 

Qy 890 AKQALAENEVL - FGTN GT KD - EQ PKGS 914 

I : I I I : I I : I I I i 

Db 606 NVQN P PKQEVFE K I PS PE FN SN N E KE LVQRKG S 638 

RESULT 4 
148693 

natural resistance-associated macrophage protein 1 - mouse
N; Alternate names: transport system membrane protein Nramp 
C; Species: Mus musculus (house mouse) 

C;Date: 02-Jul-1996 #sequence_revision 02-Jul-1996 #text_change 09-Jul-2004 
C;Accession: 148693; A57071; A40739 

R;Barton, C.H.; White, J.K.; Roach, T.I.; Blackwell, J.M. 
J. Exp. Med. 179, 1683-1687, 1994 

A;Title: NH2-terminal sequence of macrophage-expressed natural resistance-associated macrophage protein (Nr 
A;Reference number: 148693; MUID: 94216838; PMID:7513015 
A;Accession: 148693 

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: mRNA
A;Residues: 1-548
A;Cross-references: UNIPROT:P41251; UNIPARC:UPI000002770F; EMBL:X75355; NID:g505155; PIDN:CAA53102.1; PID:g
R;Govoni, G.; Vidal, S.; Cellier, M.; Lepage, P.; Malo, D.; Gros, P.
Genomics 27, 9-19, 1995 

A;Title: Genomic structure, promoter sequence, and induction of expression of the mouse Nramp1 gene in macr
A;Reference number: A57071; MUID:95394476; PMID:7665187
A;Accession: A57071 

A; Status: preliminary; not compared with conceptual translation 
A; Molecule type: DNA 
A; Residues: 1-548 

A; Cross-references: UNIPARC :UPI000002770F 

R; Vidal, S.M.; Malo, D.; Vogan, K. ; Skamene, E. ; Gros, P. 

Cell 73, 469-485, 1993 

A;Title: Natural resistance to infection with intracellular parasites: isolation of a candidate for Bcg.

A;Reference number: A40739; MUID:93258812; PMID:8490962

A;Accession: A40739 

A; Status: preliminary 

A;Molecule type: nucleic acid 

A;Residues: 65-548

A;Cross-references: UNIPARC:UPI0000178BFA

A;Note: sequence extracted from NCBI backbone (NCBIN:131666, NCBIP:131667)
C;Superfamily: natural resistance-associated macrophage protein 1



Query Match 2.4%; Score 117; DB 2; Length 548; 

Best Local Similarity 20.2%; Pred. No. 0.13; 

Matches 136; Conservative 89; Mismatches 207; Indels 240; Gaps 34;



Qy 


67 


VLIDVSPPEAEKRGSYGSTAHASEPG-GQQAAACRAGSPAKPRIADFVLVWEEDLKLDRQ 
:: 1 1 1 1 1 II 1 1 : 1 II 1 1 1 1 I "| 
MISDKSPPRL-SRPSYGSI — SSLPGPAPQPAPCR ETYLSEKIP 


125 


Db 


1 


41 


Qy 


126 


QDSAARDRTDMHRTWRET FLD — NLRAAGLCVDQQDVQDGNTTVHYALLSA 

II : : : 1 1 INI: : |:| 1 I 
I PSADQGT FS LRKLWAFTGPGFLMSIAFLD PGN I E S D L QAGA VAGFKL LWVL 


174 


Db


42 


93 


Qy 


175 


II : 1 1 | | : | : | : | | : 
LWATVLGLLCQRLAARLGWTGKDLGEVCHLYYPKVPRILLWLTIELAIVGSDMQEVIGT 


198 


Db 


94 


153 


Qy 


199 


ASNW SAGLLAWLGI PNVLLEWPDVPPE YYSCRFRVNKLPRFLGSDNQDT FFTSTKR 

1 : : III: 1 1 1 : : 1 1 : : : 1 1 I I' 


255 


Db 


154 


197 


Qy 


256 


HQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGPFKTPPEGPQAPRLNQRQV 

:| 1:1 1 :|:l : 1 ::| 1 * 1 : : 

— LL IT IMALT - FGYE YWAHP — SQGALLKGLVLPTCPGCGQPELLQAVGI VGAI I 


315 


Db 


198 


249 


Qy 


316 


LFQHWARWGKWNKYQPLDHVRRYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFL- 
: : 1 : :| II : :|| I: I : :: |:: f: 
MP HN I Y LH S ALVKS RE VD RT RRVD VREANMYF LIEATIALSVSFIINLFVM 


374 


Db 


250 


300 


Qy 


37 5 


-VFSDIPTQELCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFD HGG 

II 1: ::::! :| : :| ::| || 
AVFGQAFYQQT— NEEAFNIC ANSSLQNYAKIFPRDNNTVSVDIYQGG 


422 


Db 


301 


34 6 


Qy 


423 


TVFFSLF MALWAVLLLEYWKRKSATLAY RWDCSDYEDTEERPRP 

: II : : 1 1 1 II : : 1 I II 
VILGCLFGPAALYIWAVGLLAAGQSSTMTGTYAGQFVMEGFLKLRW 


466 


Db 


347 


392 


Qy 


467 


QFAASAPMTAPNPITGEDEPYFPERSR-ARRMLAGSWIV — VMVAV V 

1 1 1 1 : 1 1 1 : 1 : 1 1 1 : 
SRFARVLLTRSCAILPTVLVAVFRDLKDLSGLNDL 


511 


Db 


393 


427 


Qy 


512 


VMCLVS 1 1 LYRAIMAI WSRSGNTLLAAWAS-RI ASLTGS WNLVFILILSKI 


563 


Db 


428 


: 1 I:: I: 1 :: :|: I:: 1 :|| |:: 
LNVLQSLLLPFAVLPILTFTSMPAVMQEFANGRMSKAITSCIMALVCAINLYFVI SY 


484 


Qy 


564 


YVSLAHVLTRWEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVYIAF FKGRFVGYPG 

II 1 | : | : | : : | : | : | : : 
LPSLPH PAYFGLVALFA-IGYLGLTAYLAWTCCIAHGATFLTHSS 


618 


Db 


485


528 


Qy 


619 


NYHTLFGVRNEE 630 
: 1 1:1: III 
HKHFLYGLPNEE 540 




Db 


529 





RESULT 5 



http://es/ScoreAccessWeb/GetItem.action?AppId=10552515&seqId=775629&ItemName... 11/17/2006 



GNWVTC 

genome polyprotein - hepatitis C virus 

N;Contains: capsid protein C; envelope protein M; hepacivirin (EC 3.4.21.98) (nonstructural protein NS3) ; m 
C; Species: hepatitis C virus 

C;Date: 31-Mar-1992 #sequence_revision 31-Mar-1992 #text_change 09-Jul-2004 
C;Accession: A38465 

R;Takamizawa, A.; Mori, C.; Fuke, I.; Manabe, S.; Murakami, S.; Fujita, J.; Onishi, E.; Andoh, T.; Yoshida,
J. Virol. 65, 1105-1113, 1991 

A;Title: Structure and organization of the hepatitis C virus genome isolated from human carriers. 

A;Reference number: A38465; MUID:91140698; PMID:1847440

A;Accession: A38465 

A; Molecule type: genomic RNA 

A; Residues: 1-3010 

A;Cross-references: UNIPROT:P26663; UNIPARC:UPI0000131E1C; EMBL:M58335; NID:g329770; PIDN:AAA72945.1; PID:g
C; Superfamily: hepatitis C virus genome polyprotein 

C; Keywords: ATP; capsid protein; envelope protein; glycoprotein; hydrolase; nonstructural protein; nucleoti 

F;2-115/Product: capsid protein C #status predicted 

F;116-191/Product: envelope protein M #status predicted
F;192-389/Product: major envelope protein E #status predicted
F;390-729/Product: nonstructural protein NS1 #status predicted
F;730-1006/Product: nonstructural protein NS2 #status predicted
F;1007-1615/Product: hepacivirin #status predicted
F;1230-1237/Region: nucleotide-binding motif A (P-loop)
F;1312-1317/Region: nucleotide-binding motif B
F;1316-1319/Region: DEXH motif
F;1616-1862/Product: nonstructural protein NS4a #status predicted
F;1863-2013/Product: nonstructural protein NS4b #status predicted
F;2014-3010/Product: nonstructural protein NS5 #status predicted
F;196,209,234,250,305,325,417,423,430,448,532,540,556,576,623,645,1213,1255,2041,2077,2240,2529,2788/Bindin

Query Match 2.3%; Score 115.5; DB 1; Length 3010; 

Best Local Similarity 19.7%; Pred. No. 1.7;

Matches 163; Conservative 83; Mismatches 285; Indels 295; Gaps 40; 



Qy 128 SAARDRTDMHRTWRETFLDNLRAAGLCVDQ QD VQDGNT T V H YAL LS AS WA VL C Y Y 182
II II II III ill 1 : :l MM
Db 314 SGHRMAWDMMMNWSPT TALWSQLLRI PQAWDMVAGAHWGVL AGLAYY 362

Qy 183 AEDLRLKLPLQELPNQASNWSAGLLAWLGIPNVLLEVVPDVPPEYYSCRFRVNKLPRFLG 242
: 1 1 1 : I: 1 1 1
Db 363 SMAGNWAKVL I VML LFAG 380

Qy 243 SDNQDTFFT STKRHQILF EI LAKT PYGH EKKNLLGI HQLLAEGVLS 288
1 II 1 :| 1 :| ::: 1 : I : I I |:
Db 381 VDG-DTHVTGGAQAKTTNRLVSMFASGPSQKIQLINTNGSWHINRTALNCNDSLQTGFLA 439

Qy 289 AAFPLHDGPFKTPPE GP QAPRLNQRQVLFQH 319
III II II :: | : | | : :
Db 440 ALFYTHSFNSSGCPERMAQCRTIDKFDQGWGPITYAESSRSDQRPYCWHYPPPQCTIVPA 499

Qy 320 WARWGK WNKYQPLDHVRRYFGEKVALY 346
III: 1 : 1 : 1 1
Db 500 SEVCGPVYCFTPSPVWGTTDRFGVPTYRWGENETDVLLLNNTRPPQ-- GNWFG 551

Qy 347 FAWL GFYTGWLLPAAWG TLVFLVGCFLVFSD I PTQELCGSKDS FEMCPLCL 398
! : II 1 : 1 II II : 1 III. : I I.:
Db 552 CTWMNSTGFTFCTCGGPPCNIGGVGNNTLTCPTDCFRKHPE-ATYTKCGSGP-- WLTPRCM 608

Qy 399 -DCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDY 457
II: | | : : || I : : | | : | : | : | | | :
Db 609 VDYPY RLWHYPCTVN FT I FKVRMYVGGVEH -- RLNA -- ACNWTRGER 651

Qy 458 EDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIVA/MVAVVVMCLVS 517
1 CI II: : 1 : II 1 ::| 1 : : :||
Db 652 CDLEDRDRPELSPLLLSTTEWQVLPCSFTTLPALSTGLIHLHQNIVDVQYLYGIGSAWS 711

Qy 518 IIL YRAIMAIWSRSGNTLLAAWAS-RIASLTGSWNLVFILILSKIYVSLAHVLTR 573
: 1 :: :::: 1 II :: III :|: |: II
Db 712 FAIKWEYVLLLFLLLA-DARVCACLWMMLLIAQAEAALENLV VLNSASVAGAH 763

Qy 574 WEMHRTQTKFEDAFTLKVFI FQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGV 626
1 1 : 1 : : II 1 1 1 1 II : I : 1 1
Db 764 GILS FLVFFCAAWY I --- KGRLV-- PGAT YALYGVWPLLLLL 800

Qy 627 RN E E C AAGGC L I EL AQE L L V I MVG KQ V -- IN NMQE VL I P KL KGWWQK FR 673
: I I : :: : I : :| III
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Db 801 LALPPRAYAMDREMAASCGG AVFVGLVLLTLSPYYKVFLARLIWWLQYFT 850 

Qy 674 LRSKKRKAGASAGASQGPW EDDYELVPC EGLFDEYLEMVLQFGFVT I 720 

I II I I I : I 1:11 :: I : : 

Db 851 TR AEADLHVWIPPLNARGGRDAIILLMCAVHPELIFDITKLLIAILGPLMV 901 

Qy 721 FVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGIWFHILAGLT H 774 

I II : I I I I I : : I I I I I 

Db 902 LQAGITRVPYF VRAQGL I H ACML VRKV A - GG H Y VQMA FMKLGALTGTYIYNH 952 

Qy 775 LAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFA 820 

I : : III | : : | | | : 
Db 953 LTPLRD WPRA GLRDLAVAVEPWFS 977 

RESULT 6 
T35404 

probable squalene-hopene cyclase - Streptomyces coelicolor 
C; Species: Streptomyces coelicolor 

C;Date: 05-Nov-1999 #sequence_revision 05-Nov-1999 #text_change 09-Jul-2004 
C;Accession: T35404 

R;Oliver, K.; Harris, D.; Bentley, S.D.; Parkhill, J.; Barrell, B.G.; Rajandream, M.A.
submitted to the EMBL Data Library, March 1999
A;Reference number: Z21577
A;Accession: T35404

A;Status: preliminary; translated from GB/EMBL/DDBJ
A;Molecule type: DNA
A;Residues: 1-680

A;Cross-references: UNIPROT:Q9X7V9; UNIPARC:UPI00000DAF47; EMBL:AL049485; PIDN:CAB39697.1; GSPDB:GN00070
A;Experimental source: strain A3(2)
C;Genetics:

A;Gene: SCOEDB:SC6A5.13

C; Superfamily: squalene-hopene cyclase 

Query Match 2.2%; Score 110.5; DB 2; Length 680; 

Best Local Similarity 22.8%; Pred. No. 0.61;

Matches 112; Conservative 52; Mismatches 173; Indels 155; Gaps 27; 

Qy 40 MTSETSSGSHCARSRMLRRRAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAAC 99
I I : I : I I I I I I : I I : I : I : I I I
Db 1 MTA-TTDGSTGASLRPLAASASDTDITI PAAAAGVPEAAA- 39

Qy 100 RAGSPAKPRIADFVLV WEEDLKLDRQQDSAARDRTDMHRTWRET FL DN 147
I I I I : I I : I I : : I I : II :
Db 40 RATRRATDFLLAKQDAEGWWKGDL ETNVTMDAEDL LLRQFLGI QDEET 87

Qy 148 LRAAGLCVDQQDVQDGNTTVHY-ALLSASWAVLCYYAEDLRLKLPLQELPN -- QASNW -- 202
I I I ! : : : I I I I : I I I I I I I : : I : I
Db 88 TRAAALFIRGEQREDGTWATFYGGPGELSTTIEAYVA--LRLAGDSPEAPHMARAAEWIR 145

Qy 203 SAGLLA WLGI PN-VLLEWPDVPPE -- YYSCRFRVNKLPRFLGSDNQDTFFT 251
I I :| II: : : | :: | | | | : | :: : | |
Db 146 SRGGIASARVFTRIWLALFGWWKWDDLPELPPELIYF PTWVPLNIYD -- FG 194

Qy 252 STKRHQI -- LFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFP ---LHDGPFKTPPEGPQ 306
111:111 I I I I 111:11
Db 195 CWARQTIVPLTIVSAKRP VRPAPFPLDELHTDPARPNPPRPL 236

Qy 307 AP RLNQ RQVLFQHWARW GKWNKYQPLDHVR 336
II I::: I: III I I II
Db 237 AP VASWDGAFQR I D KALH AYRKVAPRRL RRAAMN SAARWI IERQEN DGCWGG IQP 291

Qy 337 RYFGEKVALYFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPL 396
I : ! : I : I I : : : I I : : | |
Db 292 PAVYSVIALYLLGYDLEHPVMRAGLESLDRFAVWRE DGARMIEA 335

Qy 397 CLDCPFW-LLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATL AY 450
I II : I I I II II I I : I : : II I I :
Db 336

Qy 451
: I I ::
Db 395



RESULT 7 
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S46584 

probable membrane protein YJL094c - yeast (Saccharomyces cerevisiae)
N; Alternate names: hypothetical protein J0909 
C; Species: Saccharomyces cerevisiae 

C;Date: 13-Jan-1995 #sequence_revision 13-Jan-1995 #text_change 09-Jul-2004 
C;Accession: S46584; S56871; S47057 
R;Miosga, T.; Witzel, A.; Zimmermann, F.K. 
Yeast 10, 965-973, 1994 

A;Title: Sequence and function analysis of a 9.46 kb fragment of Saccharomyces cerevisiae chromosome X. 
A;Reference number: S46584; MUID:95076716; PMID:7985424
A;Accession: S46584
A;Molecule type: DNA
A;Residues: 1-873

A;Cross-references: UNIPROT:P40309; UNIPARC:UPI000013B5C4; EMBL:X77087; NID:g521093; PIDN:CAA54359.1; PID:g
A;Note: the authors translated the codon TCC for residue 645 as Trp

R;Miosga, T.; Schaaff-Gerstenschlaeger, I.; Baur, A.; Boles, E.; Chalwatzis, N.; Fournier, C.; Schmitt, S.;
submitted to the Protein Sequence Database, September 1995
A;Reference number: S56855
A;Accession: S56871
A;Molecule type: DNA
A;Residues: 1-873
A;Cross-references: UNIPARC:UPI000013B5C4; EMBL:Z49369; NID:g1008267; PID:g1008268; MIPS:YJL094c
C;Genetics:
A;Gene: SGD:KHA1
A;Cross-references: SGD:S0003630
A;Map position: 10L

C; Keywords: transmembrane protein 

Query Match 2.2%; Score 110.5; DB 2; Length 873; 

Best Local Similarity 21.4%; Pred. No. 0.86; 

Matches 69; Conservative 56; Mismatches 106; Indels 91; Gaps 16; 

Qy 503 VIWMVAVWMCLVSIILYRAI--MAIWSRSGNTLIAA W ASRIASL 547 

I : I : I I : : I I : : : : I : I : I I I I : : : I 

Db 157 VFMVFIAVSISVTAFPVLCRILNELRLIKDRAGIWLAAGIINDIMGWILLALSIILSSA 216 



Qy 548 TGSWNLVFI LI LS K I YV S L AH VLT RWEMH RT QTKFEDAFTLKVFI FQFVNF 599 

II II 1:11::: II I II : II ::| |: : |:: 

Db 217 EGSPWTVYILLITFAWFLIYFFPLKYLLRWVLIRTHELDRSKPSPLATMCILFIMFISA 276 

Qy 600 YSS PVYIAFFKGRFVGYPGNYHTLFGVRNEEC AAGGCL I ELAQ- 642 

I : I *- : I I I I : I I I : I I : : I 

Db 277 YFTDIIGVHPIFGAFIAGLWPRDDHYWKLTERMEDIPNIVFI PIYFAVAGLNVDLTLL 336 

Qy 643 ELLVIMVGKQVINNMQEVLIPKLKG-WWQKFRLRSKKRKAGASAGASQGP 691 

: I : : : I : I I I I : I :: : I I 
Db 337 NEGRDWGYVFAT IGIAI FTKI I SG TLTAKLTGLFWRE ATAAGV 379 

Qy 692 WEDDYELVPCEGLFD-EYLEMVLQFGFVT 1 FVAAC PLAPLFALLNNWVE I RLDAR 745 

I: |:|: : I : I I :: :|| I II:: : I I 

Db 380 LMSCKGIVEIWLTVGLNAGI ISRKIFGMFV LMALVSTFVTTPLTQL 426 

Qy 746 KFVCEY RRPVAERAQDIG 763

: I I: :: |:| |
Db 427 VYPDSYRDGVRKSLST PAEDDG 448



RESULT 8 
T00487

probable potassium transport protein F19I3.29 - Arabidopsis thaliana 
C; Species: Arabidopsis thaliana (mouse-ear cress) 

C;Date: 12-Feb-1999 #sequence_revision 12-Feb-1999 #text_change 09-Jul-2004 
C;Accession: T00487; B84764 

R;Rounsley, S.D.; Lin, X.; Ketchum, K.A.; Crosby, M.L.; Brandon, R.C.; Sykes, S.M.; Kaul, S.; Mason, T.M.; 
submitted to the EMBL Data Library, April 1998 

A; Description: Arabidopsis thaliana chromosome II BAC F19I3 genomic sequence. 
A; Reference number: Z14160 
A;Accession: T00487 

A; Status: translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A;Residues: 1-792 

A;Cross-references: UNIPROT:O64769; UNIPARC:UPI00000485F4; EMBL:AC004238; NID:g3033373; PID:g3033401
A; Experimental source: cultivar Columbia 

R;Lin, X.; Kaul, S.; Rounsley, S.D.; Shea, T.P.; Benito, M.I.; Town, CD.; Fujii, C.Y.; Mason, T.M.; Bowman 
Nature 402, 761-768, 1999 

A;Title: Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana. 
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A;Reference number: A84420; MUID : 20083487; PMID : 106171 97 
A; Accession: B84764 
A; Status: preliminary 
A; Molecule type: DNA 
A; Residues : 1-792 

A;Cross-references: UNIPARC:UPI00000485F4; GB:AE002093; NID:g3033401; PIDN:AAC12845.1; GSPDB:GN00139
C;Genetics : 

A;Gene: At2g35060; F19I3.29 
A; Map position: 2 

A;Introns: 50/3; 126/1; 208/1; 225/1; 312/1; 368/1; 627/2 
C;Superfamily: barley probable potassium transport protein HAK1 

Query Match 2.2%; Score 110; DB 2; Length 792;

Best Local Similarity 18.5%; Pred. No. 0.83; 

Matches 122; Conservative 94; Mismatches 194; Indels 250; Gaps 28; 

Qy 60 AQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEED 119 

I : I : I : :: I : : : I I I : | : | 

Db 3 ARVEAATMGGEI DEEESDERGS MWDLD 29 

Qy 120 LKLDRQQDSAARDRTDMHRTWRETFL DNLRAAGLCV 155 

111:11 : | : | : : | ' :| : I I 

Db 30 QKLDQSMDEEAGRLRNMYREKKFSALLLLQLSFQSLGWYGDLGTSPLYVFYNT FPHGIK 89

Qy 156 DQQDVQDGNTTVHYAL LSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLAWLG 211 

I : I : : : I : I* I I : I I I ill 
Db 90 DPEDIIGALSLI IYSLTLIPLLKYVFWC-KAND NGQGGTFA 130 

Qy 212 I PNVLLEWPDVPPEYYS -- CRF-RVNKLPRFLGSDNQDT FFTSTKRHQI LFEI LAKT PY 268

II II :| : :| : | :: | I : I | | | 

Db 131 LYSLLCRHAKVKTIQNQHRTDEELTTYSRTTFHEHSF — AAKTKR 173 

Qy 269 GHEKKN LLGIHQLLAEGVLSAAFPLHD — GPFKTPPEGPQAPRLNQRQVL 316 

II: I : I : : : I : I : | : | : | : : | : 

Db 174 WL EKRT S RKT AL L I L VL VGT CMV I GD G I LT P A I S VL S AAGGL RV NLPHISNGWV 228 

Qy 317 F QHWARWGKWNKYQPLDHVRRYFGEKVALYF AWLGFYTGWLLPAA 361 

I II: I I I I I: I I :| I I : 

Db 229 FVAWILVSLFSVQHYG TDRVGWLFAPIVFLWFLSIASIGMYNIWKHDTS 278 

Qy 362 W GTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLC 397 

I : I : : : | : | : | : : : : | : 
Db 279 VLKAFSPVYIYRYFKRGGRDRWTSLGGIMLSITGIEALFADLSHFPVSAVQIAFTV 334

Qy 398 LDCPFWLLSS AC AL AQ AGRL FD H -GGTVFFSLFMALWAVLLLEYWKRKSATL 448

: I I I : : I III I : I : : : I : I : : I II 

Db 335 IVFPCLLLAYSGQAAYIRRYPDHVADAFYRSIPGSVYWPMFI IATAAAIVASQATISATF 394 

Qy 449 AYRWDCSDYEDTEERPRPQFAASA PMTAPN PITGEDEPYF 488

II II : :: : |: :| I
Db 395 SLVKQALAHGCF PRVKWHTSRKFLGQI YVPDINWILMILCIAVTAG F 442

Qy 489 PERSRARRMLAGSWIWMVAV^A/MCLVSI ILYRAIMAIWSRSGNTLLAAWASRIASLT 548

:|: :llll::l ::| I: |:::| :| I'
Db 443 KNQSQIGNAYGTAWIVMLVTTLLMTLIMILVWRCHWVLV LI 484

Qy 549 GSWNLV FILILSKI YVS LAH VLT RWEMH RT QT KFED AFT L KV F I FQ 595

: I : : I I I : I M : : I : I I I Mi : I : 

Db 485 FTVLSLWECTYFS AMLFKI DQGGWVPLVI AAAFLL IMWVWH YG TLKRYEFE 536 



RESULT 9 
A45573 

genome polyprotein - hepatitis C virus (strain JT)

N;Contains: capsid protein C; envelope protein M; hepacivirin (EC 3.4.21.98) (nonstructural protein NS3); m 
C; Species: hepatitis C virus 

C;Date: 19-May-2000 #sequence_revision 19-May-2000 #text_change 09-Jul-2004 
C;Accession: A45573

R;Tanaka, T.; Kato, N.; Nakagawa, M. ; Ootsuyama, Y.; Cho, M.J.; Nakazawa, T.; Hijikata, M.; Ishimura, Y.; S 
Virus Res. 23, 39-53, 1992 

A;Title: Molecular cloning of hepatitis C virus genome from a single Japanese carrier: sequence variation w 

A;Reference number: A45573; MUID: 92295714 ; PMID: 1318627 

A;Accession: A45573 

A; Status: preliminary 

A; Molecule type: DNA 

A; Residues: 1-3010 
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A/Cross-references: UNIPROT:Q00269; UNIPARC:UPI0000131E29; GB:D11168; GB:D01171; NID:g221612; PIDN : BAA0194 3 
A; Experimental source: HCV-JT 

A;Note: sequence extracted from NCBI backbone (NCBIN:106206, NCBIP:106207)
C; Superfamily: hepatitis C virus genome polyprotein 

C; Keywords: ATP; glycoprotein; hydrolase; nucleotide binding; P-loop; polyprotein; serine proteinase; trans 

F;2-115/Product: capsid protein C #status predicted
F;116-191/Product: envelope protein M #status predicted
F;192-389/Product: major envelope protein E #status predicted
F;390-729/Product: nonstructural protein NS1 #status predicted
F;730-1006/Product: nonstructural protein NS2 #status predicted
F;1007-1615/Product: hepacivirin #status predicted
F;1230-1237/Region: nucleotide-binding motif A (P-loop)
F;1312-1317/Region: nucleotide-binding motif B
F;1316-1319/Region: DEXH motif
F;1616-1862/Product: nonstructural protein NS4a #status predicted
F;1863-2013/Product: nonstructural protein NS4b #status predicted
F;2014-3010/Product: nonstructural protein NS5 #status predicted

Query Match 2.2%; Score 108; DB 1; Length 3010; 

Best Local Similarity 20.4%; Pred. No. 7.4; 

Matches 160; Conservative 86; Mismatches 276; Indels 262; Gaps 41; 

Qy 128 SAARDRTDMHRTWRETFLDNLRAAGLCVDQ QDVQDGNTTVHYALLSASWAVLCYY 182 

I I I I I I III III I : :| I I I I 
Db 314 SGHRMAWDMMMNWSPT TALWSQLLRIPQAWDMVAGAHWGVL AGLAYY 362 

Qy 183 AEDLRLKLPLQELPNQASNWSAGLLAWL GIPNVLLEWPDVPPEYYSCRFRVNKLPR 239 

: I I : I : I |: I |: 
Db 363 SMVGNW AKVL I VML L F AGVD GVT YTTG 389 

Qy 240 FLGSDNQDT FFTSTKRHQI LFEI LAKT PYGHEKKNLLGI HQLLAEGVLSAAFPLH 294

I I : I III :| ::: I : I :: I I : I I I 

Db 390 — GSQARHTQSVTSFFTQGPAQRI — QLINTNGSWHINRTALNCNESLNTGFFAALFYAH 445 

Qy 295 DGPFKTPPE GP QAPR-LNQRQVLFQHWA 321 

II II 111:11 : I : I 

Db 446 KFNSSGCPERMASCSSIDKFAQGWGPITYTEPRDLDQRPYCW-HYAPRQCGIVPASQVCG 504

Qy 322 RWGK WNKYQPLDHVRRYFGEKVALYFAWL- 350 

II I :| :|| |: 
Db 505 PVYC FT PS PWVGTTDRS GAPT YNWGAN ET DVLLLNNT RP PQ--GNWFG CTWMN 556 

Qy 351 -- GFYTGWLLPAAWG TLVFLVGC FLVFSD I PTQELCGSKDS FEMC PLCL - DC PF 402

II I : I II II : I III : | | : | | :

Db 557 STGFTKTCGGPPCNIGGVGNLTLTCPTDCFRKHPE-ATYTKCGSGP -- WLTPRCIVDYPY 613

Qy 403 WLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEE 462

II:: II |::| I :|: III I I : I I : 

Db 614 RLWHYPCTVNFTIFKVRMYVGGVEH — RLSA — ACNWTRGERCDLED 656 

Qy 463 RPRPQ FAASAPMTAPNPITGEDEPYFPERSRARRMLAGSWIWMVAVWMCLVS 517

II: : : II I II I ::| I : : : I I 

Db 657 RDRSELSPLLLSTT EWQTLPCSFT TLPALSTGLI HLHQNIVDVQYLYGIGSAWS 711 

Qy 518 IILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNLVFILILSKIYVSLAHVLTRWEMH 577 

:: : :: I I I I II :|:::: :| ::: :: 
Db 712 FVIKWEYIVLLF LLLADARVCACLW MMLL I AQAEAALEN LW LN 755

Qy 578 RTQTKFEDAFTLKVFIFQFVNFYSSPVYIAFFKGRFVGYPGNYHTLFGV 626

I I I : I : : I I I I I I I I : I : I I 
Db 756 AASLAGADG ILSFLVFFCAAWYI KGRLV — PGAAYALYGVWPLLLLLLALP 804 

Qy 627 RNEECAAGGCLI ELAQELLVIMVG — KQVINNMQEVLI PKLKGWWQKFRLRSK 677 

I : I I I: II :: : I : : | III I : : 

Db 805 PRAYAMDREMAASCGG WFVGLILLTLSPHYKVFLARLIWWLQYFITRAE 854 

Qy 678 KRKAGASAGASQGPWEDDYELVPC EGLFDEYLEMVLQFGFVTIFVAACPLAPLFAL 733 

: I 1:1 I :|| :: I : : | | -| | 

Db 855 AHLCVWVPPLNVRGGRDAIILLTCAAHPELIFDITKLLLAILGPLMVLQAAITAMPYFVR 914 

Qy 734 LNNWVE I RLDARK FVCEYRRPVAERAQDIGIWFHILAGLT 773 

: : II :| :: |: II II III 
Db 915 AQGL I RACMLVRKVAGGH YVQMAFMKLAALTGTYVYDHLT PL QD WAH--AGLR 965 

Qy 774 HLAV 777 

I I I 
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Db 



966 DLAV 969 



RESULT 10 
T11129 

cytochrome-c oxidase (EC 1.9.3.1) chain I - acorn worm mitochondrion 
C; Species: mitochondrion Balanoglossus carnosus 

C;Date: 16-Jul-1999 #sequence_revision 16-Jul-1999 #text_change 09-Jul-2004 
C;Accession: T11129 

R;Castresana, J.; Feldmaier-Fuchs, G.; Yokobori, S.; Satoh, N.; Paabo, S.
Genetics 150, 1115-1123, 1998 

A;Title: The mitochondrial genome of the hemichordate Balanoglossus carnosus and the evolution of deuterost 
A;Reference number: Z17250; MUID : 99016090; PMID:9799263 
A;Accession: T11129 

A; Status: preliminary; translated from GB/EMBL/DDBJ 
A;Molecule type: DNA 
A; Residues: 1-519 

A;Cross-references: UNIPROT:O63612; UNIPARC:UPI000008CD49; EMBL:AF051097; NID:g3065680; PID:g3065692; PIDN:
C;Genetics:

A; Genome: mitochondrion 

C; Superfamily: cytochrome-c oxidase chain I; cytochrome-c oxidase chain I homology 

C; Keywords: copper; electron transfer; heme; iron; magnesium; membrane-associated complex; metalloprotein; 
F;11-457/Domain: cytochrome-c oxidase chain I homology

F;61,378/Binding site: heme a iron (His) (axial ligands) #status predicted
F;240,290,291/Binding site: copper (His) #status predicted
F;240-244/Cross-link: 1'-histidyl-3'-tyrosine (His-Tyr) #status predicted
F;244/Binding site: oxygen (Tyr) #status predicted

F;368/Binding site: magnesium (His) (shared with chain II) #status predicted
F;376/Binding site: heme a3 iron (His) (axial ligand) #status predicted

Query Match 2.2%; Score 106.5; DB 2; Length 519; 

Best Local Similarity 17.8%; Pred. No. 0.93; 

Matches 94; Conservative 51; Mismatches 167; Indels 215; Gaps 18; 



Qy 472 APMTAPNP ITGEDEPYFPERSRARRMLAGSWIVVMVAVWMCLVS II LYRAIMAI WSR 531
1 : 1 1: |:|: 1 |:| :::: :| |::
Db 39 AELAQPGPLLGDDQIY : NVIVTAHAFVMI FFMVMPIMIGG 77

Qy 532 SGNTLL ' AAW ASRIA 545
1 1 1 1 11:1
Db 78 FGNWLLPLMLGAPDMAFPRLNNMSFWLLPPSFLLLLSSAGVESGVGTGWTVYPPLAGNMA 137

Qy 546 SLTGSWNLVFILIL SKIYVSLAHVLTRWEMHRTQTKFE -- DAFTLKVFIFQFVNFY 600
Ml :| I 1 i 1 :: : | | :|: | Ml :
Db 138 HAGGSVDLAI FSLHLAGI SS ILGAINFMTTVINMRAPGVRFDRLPLFWJSVFITVI LLLL 197

Qy 601 605
1 II
Db 198 SLPVLAGAITMLLTDRNLNTSFFDPAGGGDPILYQHLFWFFGHPEVYILILPAFGMISHV 257

Qy 606 IAFFKGRF -- VGYPGNYHTLFGVRNEECAAGGCLIELAQELLVIMVGKQVINN M 657
II 1: 1: |||::: III : 1 1 1
Db 258 IAFYSGKKEPFGYLGMVYAMIAI GI LG FL VWAH HMFT VGMD VDT RAY FT AAT 309

Qy 658 QEVLIP KLKGWWQKFRLRSKKRKAGASAGASQGPWEDDYELVPCEGLFDEYLEMVLQ 714
: :| I: 1* hill . ::
Db 310 MVIAVPTGIKIFSWL ATLHGSALQWE APLLWA 341

Qy 715 FGFVTIFVAACPLAPLFALLNNWVEIRLDARKFVCEYRRPVAERAQDIGI WFHI 768
III :| ||::::: :| : | || || :
Db 342 LGFVFLFTVGGLTG -- IVLSNSSLDWMHDTYYVVAHFHYVLSMGAVFGI FAGFIHWFPL 399

Qy 769 LAGLTHLAVISNAFLLAFSSDFLPRAYYRWTRAHDLRGFLNFTLARAPSSFAAAHNRTCR 828
1 1 1 1 1 1 : 1 Mill 1
Db 400 FTGLTLHPV WTKFHFWMFLGVNLTFFPQHFLGLAGMPRR 439

Qy 829 YRAFRDDDGHYSQTYWNLLA IRLAFVIVFEHW FSVGRL 867
1 : 1 : 1 1 1 : M : . 1 1 I 1 : I : : I : II
Db 440 YSDYPD AYTTWNVLSSVGSI VSLASVI I FLAI LWEAFTARRL 481





RESULT 11 
B86088 

probable citrate permease Z5523 [imported] - Escherichia coli (strain O157:H7, substrain EDL933)
C; Species: Escherichia coli 

C;Date: 16-Feb-2001 #sequence_revision 16-Feb-2001 #text_change 09-Jul-2004 
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C; Accession: B86088 

R;Perna, N.T.; Plunkett III, G.; Burland, V.; Mau, B.; Glasner, J.D.; Rose, D.J.; Mayhew, G.F.; Evans, P.S.
Nature 409, 529-533, 2001

A;Title: Genome sequence of enterohemorrhagic Escherichia coli O157:H7.
A;Reference number: A85480; MUID:21074935; PMID:11206551
A;Accession: B86088
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-438
A;Cross-references: UNIPROT:Q8X4P7; UNIPARC:UPI00000D0D9E; GB:AE005174; NID:g12518889; PIDN:AAG59166.1; GSP
A;Experimental source: strain O157:H7, substrain EDL933
C;Genetics:
A;Gene: Z5523

C;Superfamily: citrate utilization determinant 

Query Match 2.1%; Score 106; DB 2; Length 438; 

Best Local Similarity 23.3%; Pred. No. 0.82; 

Matches 57; Conservative 31; Mismatches 69; Indels 88; Gaps 13;

Qy 339 GEKVAL-YFAWLGFYT GWLLPAAWGTLVFLVGCFLVFS DIP T 381
I I I I I I I I I II : I : I I : I II : I I I : I
Db 173

Qy 382
I : : I : ! : : I II I I I : I I I II I
Db 232 SESAFSL LMQHKATIAN-GILLAIGSTVATYISLFYYGTWAAKYL 280

Qy 439
I I II : I
Db 281 GMNQNY SHAAMLL 293

Qy 499 VVWMC LVSIILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNL 554
I : : I : I I : I : : I I I : I I : : :
Db 294

Qy 555
II::
Db 349

RESULT 12
E91240



probable membrane transport / symporter protein ECs4893 [imported] - Escherichia coli (strain O157:H7, subs,
C; Species: Escherichia coli 

C;Date: 18-Jul-2001 #sequence_revision 18-Jul-2001 #text_change 09-Jul-2004 
C;Accession: E91240 

R;Hayashi, T.; Makino, K.; Ohnishi, M.; Kurokawa, K.; Ishii, K.; Yokoyama, K.; Han, C.G.; Ohtsubo, E.; Naka
DNA Res. 8, 11-22, 2001

A;Title: Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with
A;Reference number: A99629; MUID:21156231; PMID:11258796
A;Accession: E91240
A;Status: preliminary
A;Molecule type: DNA
A;Residues: 1-438
A;Cross-references: UNIPROT:Q8X4P7; UNIPARC:UPI00000D0D9E; GB:BA000007; PIDN:BAB38316.1; PID:g13364369; GSP
A;Experimental source: strain O157:H7, substrain RIMD 0509952
C;Genetics:
A;Gene: ECs4893
C;Superfamily: citrate utilization determinant

Query Match 2.1%; Score 106; DB 2; Length 438;

Best Local Similarity 23.3%; Pred. No. 0.82; 

Matches 57; Conservative 31; Mismatches 69; Indels 88; Gaps 13; 

Qy 339 IKVAL-YFAWLGFYT GWLLPAAWGTLVFLVGCFLVFS DIP T 381
Ml I I I I I 11:1 : I I : I I I : I I I : I
Db 173

Qy 382
I : : I I I 111:111 II I
Db 232 -LMQHKATIAN-GILLAIGSTVATYISLFYYGTWAAKYL 280

Qy 439
I :: I I I :|
Db 281 -GMNQNY SHAAMLL 293

Qy 499 AGSWI WMV AVWMC LVSI ILYRAIMAIWSRSGNTLLAAWASRIASLTGSWNL 554
I I : I : I : : I : M : I : : | I I : I | : : :
Db 294 AGVI TFVGALLVGMLCDS VGRKKL I L I S RVMVL I CS WPS FWLLVNY PS PGMLLTV 348

Qy 555 VFILI 559
II:::
Db 349 VFVMV 353



RESULT 13 
JC1346 

dopamine beta-monooxygenase (EC 1.14.17.1) precursor - mouse 
C; Species: Mus musculus (house mouse) 

C;Date: 30-Sep-1993 #sequence_revision 30-Sep-1993 #text_change 09-Jul-2004 
C;Accession: JC1346

R;Nakano, T.; Kobayashi, K.; Saito, S.; Fujita, K. ; Nagatsu, T. 
Biochem. Biophys. Res. Commun. 189, 590-599, 1992 

A;Title: Mouse dopamine beta-hydroxylase: primary structure deduced from the cDNA sequence and exon/intron
A;Reference number: JC1346; MUID:93080618; PMID:1280432
A;Accession: JC1346
A;Molecule type: mRNA
A;Residues: 1-621

A;Cross-references: UNIPROT:Q64237; UNIPARC:UPI0000029950; GB:S50200; NID:g260872; PIDN:AAB24330.1; PID:g26
C;Comment: This enzyme catalyzes the hydroxylation of dopamine to norepinephrine.
C;Genetics:

A;Introns: 117/3; 166/3; 252/3; 346/1; 401/3; 449/3; 462/3; 482/3; 525/2; 578/3

C;Keywords: catecholamine biosynthesis; copper; glycoprotein; monooxygenase; oxidoreductase; phosphoprotein
F;1-43/Domain: (or 1-46) signal sequence #status predicted
F;44-621/Product: (or 47-621) dopamine beta-monooxygenase #status predicted
F;300-523/Domain: peptidylglycine monooxygenase I homology
F;68,188,476,570/Binding site: carbohydrate (Asn) (covalent) #status predicted
F;350,528/Binding site: phosphate (Ser) (covalent) (by calmodulin-dependent kinase II) #status predicted

Query Match 2.1%; Score 105; DB 2; Length 621; 

Best Local Similarity 20.2%; Pred. No. 1.6; 

Matches 113; Conservative 61; Mismatches 187; Indels 198; Gaps 29; 



Qy 110 ADFVLVWEE DLKLDRQQD SAARDRTDMHRTWRETFLDNLRA 150
1 1 : : : 1 : : 1 1 1 1 1 III: :: |
Db 102 ADLIMLWSDGDRAYFADAWSDRKGQIHLDSQQDYQLLQAQRTRDGLSLLFKRPF 155

Qy 151 AGLCVDQQDVQDGNTTVH -- YALLSASWAVLCYYAEDLRLKLPLQELPNQASNWSAGLLA 208
: 1 :| : III 1 :| III :| 1 1
Db 156 -- VTCDPKDY VI EDDTVHLVYGI LEE PFQSL -- EAINTS 190

Qy 209 WLGIPNVLLEV VPDVPPEYYSCRFRVNKLPRFLGSDNQDTFF TS 252
1: III 1 :| : : 1 1 1 II: |::
Db 191 -- GLHTGLLRVQLLKSEVPTPSMPEDVQTMDIRA PDILIPDNEQTYWCYITELPPRF 245

Qy 253 TKRHQILFEILAKTPYGHEKKNLLGIHQLLAEGVLSAAFPLHDGP -- FKTPPEGPQAPRL 310
: 1 |::| : 1 : : : 1 II 1 II Ml II: II
Db 246 PRHHIIMYEAIV-TEGNEALVHHMEVFQCAAE SEDFPQFNGPCDSKMKPD RL 296

Qy 311 N Q RQ VL FQ HWARWG KWN KYQ P L D HVRRYFGEK VAL 345
1 : : II 1 1 1 : : |: I : : |
Db 297 NYCRHVLAAWALGAK-AFYYPKEAGVPFGGPGSSPFLRLEVHYHNPRKIQGRQDSSGIRL 355

Qy 346 -YFAWLGFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQELCGSKDSFEMCPLCLD-CPFW 403
Mil :: :| 1 : . II II :| : III
Db 356 PYTATLRRYDAGIMELGLVYTPLMA I PPQE TA FVLT GYCT DKCT QM 401

Qy 404 LLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKRKSATLAYRWDCSDYEDTEER 463
1 : 1 : 1 1 1 1 1 : 1 1 : 1 1
Db 402 ALQDSGIH I FASQLHTH LTGRKWTVLAR DGQER 435

Qy 464 PRPQFAASAPMTAPNP ITGEDEPYFPERSRARRMLAGS WI WMVAVWMCLVS II LYRA 523
III 1 1 1 1 1 .: :: 1
Db 436 KE VNRDNHYSP-HFREIRMLKKWTVYPGDVLITSC 470

Qy 524 IMAI WSRSGNTLLAAWASRIASLTGSVVNLVFI LI LSKI YVSLAHVLTRWEMHRTQTKF 583
: II 1:: 1 II :: |: 1 : |: ::
Db 471 TYNTENKTL ATVGG FGILEEMCVNYVHYYPQTELELCKSAV 511

Qy 584 EDAFTLKVFI FQFVNFYSS 602
:| 1 1 1 11 : M
Db 512 DDGFLQK -- YFHMVNRFSS 528

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RESULT 14 
H82555 

c-type cytochrome biogenesis membrane protein XF2460 [imported] - Xylella fastidiosa (strain 9a5c)
C; Species: Xylella fastidiosa 

C;Date: 18-Aug-2000 #sequence_revision 20-Aug-2000 #text_change 09-Jul-2004 
C;Accession: H82555 

R; anonymous, The Xylella fastidiosa Consortium of the Organization for Nucleotide Sequencing and Analysis, 
Nature 406, 151-157, 2000 

A;Title: The genome sequence of the plant pathogen Xylella fastidiosa. 

A; Reference number: A82515; MUID: 20365717; PMID: 10910347 

A;Note: for a complete list of authors see reference number A59328 below 

A;Accession: H82555 

A; Status: preliminary 

A; Molecule type: DNA 

A;Residues: 1-646

A;Cross-references: UNIPROT:Q9PAN5; UNIPARC:UPI00000C2A60; GB:AE004054; GB:AE003849; NID:g9107645; PIDN:AAF
A;Experimental source: strain 9a5c

R;Simpson, A.J.G.; Reinach, F.C.; Arruda, P.; Abreu, F.A.; Acencio, M.; Alvarenga, R.; Alves, L.M.C.; Araya
submitted to GenBank, June 2000
A;Authors: Ferreira, V.C.A.; Ferro, J.A.; Fraga, J.S.; Franca, S.C.; Franco, M.C.; Frohme, M.; Furlan, L.R.
A;Authors: Martins, E.M.F.; Matsukuma, A.Y.; Menck, C.F.M.; Miracca, E.C.; Miyaki, C.Y.; Monteiro-Vitorello
A;Authors: da Silva, A.C.R.; da Silva, F.R.; da Silva, A.M.; Silva Jr., W.A.; da Silveira, J.F.; Silvestri,
A;Reference number: A59328
A;Contents: annotation
C;Genetics:
A;Gene: XF2460

C; Superfamily: nrfE protein 



Query Match 2.1%; Score 104; DB 2; Length 646; 

Best Local Similarity 18.9%; Pred. No. 2; 

Matches 95; Conservative 46; Mismatches 152; Indels 210; Gaps 19; 



Qy 167 VHYALLSASWAVLC Y Y AE DLRLKLPLQELPNQAS N W S A GLLAWLGI 212
1 :||: ::|:| III: II 1 : 1 1 111
Db 45 VQLSLLAGAFALLTYAFLGNDFSVQYVAENSHSLLP -- TLYRSTAVWGAHEGSLLLW 99

Qy 213 PNVLLEWPDVPPE YYSCRFRVNKLPRFLGSDNQDT FFTSTKRHQI LFE I LAKT PYGH EK 272
Mi : 1 1 : 1 1 1 :
Db 100 -- VLL LAGWT AS VAL RS HT L PAT L S A 123

Qy 273 KNLLGIHQLLAEGVLSAAFPLHDGPF KTPPEGPQAPRLNQRQVLFQH 319
:| 1: 1:1 1 1 1 M III' II : I
Db 124 -RILGVLGLIALGFL-ALILFTSNPFARLLPAVPEGNDLNPLLQDPGMIVHPPLLYAGYI 181

Qy 320 WARW GKWNKYQPLDHVRRYFG 340
IN 1 1 1 1 : 1
Db 182 GFAVPFAFAVAVLLEGRIDPTWLRWSRPWTHTAWALLTLGIALGSWWAYYELGWGGWWFW 241

Qy 341 EKV-- ALYFAWL GFYTGWLLPAAWGTLVFLVGCFLVFSDIPTQE 383
: 1 1 : 1 1 1 : 1 1 1 1 : : 1 : 1 III 1 : 1
Db 242 DPVENASFMPWLIGVALIHSQAITDKRGSFTHWTLLLAITAFALALLGTFLVRSSVLT -- 299

Qy 384 LCGSKDSFEMCPLCLDCPFWLLSSACALAQAGRLFDHGGTVFFSLFMALWAVLLLEYWKR 443
1 :| 1: | : :| : Ml
Db 300 SVHAFAADPV RGAF I L LL I FT L I GGALL L 328

Qy 444 KSATLAYRWDCSDYEDTEERPRPQFAASAPMTAPNPITGEDEPYFPERSRARRMLAGSW 503
Mil 1 : 1 : 1 II : 1 : : :
Db 329 YARRAPQL -- TPVT INMQQRFT PVSRETLLLLNNLL 362

Qy 504 IWMVAWVMCLVS I ILYRAIMAI WSRSGNTLLAAWASRIA SLTGSWNLVFILI 559
: 1:1:: II 1 :| |: :| |:
Db 363 LTCACAMVLL GT L Y PL LADALALGQL S VGP P Y FG PL FT LL 402

Qy 560 L S KI YVS L - AH VLT RWEMH RT QT 581
:: : 1 1 III: 1 -
Db 403 MTPLIVLLPLGPFTRWQREHPST 425





RESULT 15 
JQ2034 

RNA-directed RNA polymerase (EC 2.7.7.48) - beet cryptic virus 3 
C; Species: beet cryptic virus 3 

C;Date: 03-May-1994 #sequence_revision 03-May-1994 #text_change 09-Jul-2004






C; Accession: JQ2034 

R;Xie, W.S.; Antoniw, J.F.; White, R.F. 
J. Gen. Virol. 74, 1467-1470, 1993 

A; Title: Nucleotide sequence of beet cryptic virus 3 dsRNA2 which encodes a putative RNA-dependent RNA poly 
A;Reference number: JQ2034; MUID : 93329401 ; PMID:8336129 
A; Accession: JQ2034 
A; Molecule type: mRNA 
A;Residues: 1-478 

A;Cross-references: UNIPROT:Q86632; UNIPARC:UPI00000EE5F9; GB:S63913; NID:g407557; PIDN:AAB27624.1; PID:g40
C;Genetics:
A;Map position: segment RNA2
C;Keywords: nucleotidyltransferase; reverse transcriptase 

Query Match 2.1%; Score 103.5; DB 2; Length 478; 

Best Local Similarity 20.7%; Pred. No. 1.5; 

Matches 53; Conservative 34; Mismatches 96; Indels 73; Gaps 11; 

Qy 59 RAQEEDSTVLIDVSPPEAEKRGSYGSTAHASEPGGQQAAACRAGSPAKPRIADFVLVWEE 118 

: I : I : I II' II : I -III: III I I : : II 

Db 109 KARAFDVNTELDKVPYEQSSSAGYGYRSHKGPPGGE — THMRAISRVKPTLMTAIRPDEE 166 

Qy 119 DLKLDRQQDSAARDRTDMHRTWRETFLDNLRAAGLCVDQQDVQDGNTTV 167 

I : | | : | :: | : | | 
Db 167 GPEYTILESVPDIGYTRTQLADLREKTKVRGVWGRAF 203 

Qy 168 HYALLSASWA VLCYYAEDLRLKLP — LQEL PNQASN WS AGLLAWLG I PN 214 

II I: : I : I : I : | | : : : I I | | : 

Db 204 HY I L I EGT AARPLLEN FMLGTT FMH I GS DPQLSVPRI LHQMKREGS KWLYA- LDWS S FDS 262 

Qy 215 VLLEVVPDVPPEYYSCRFRVNKLPRFLGSDNQDTFFTSTKRHQILFEILAKTPYGHEK — 272 

I :| I I I : |::| :: || |:: : |:| 

Db 263 SVTRFEINCAF — NLLKERIEFPNEET ELAFE-LSRILFKHKKLA 304 

Qy 273 KNLLGI HQLLAEG 285 

I : I I : : I 
Db 305 APDGNIYMIHKGIPSG 320



Search completed: October 27, 2006, 20:29:40 
Job time : 56 secs



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